For many years, CD BioGlyco has provided trusted Glycoinformatics-assisted Structural and Functional Prediction Services to our clients. Among these, our Nuclear Magnetic Resonance (NMR)-based, Infrared Spectroscopy (IR)-based, and Mass-based Glycan Structure Prediction Services are well recognized. Our glycoinformatics-assisted IR-based glycan structure prediction service proceeds as follows.
First, we record the IR spectra of analogous compounds of the mass-selected glycan samples. Analyzing these spectra shows how the compounds absorb light at different wavelengths, which provides the data for the subsequent prediction steps.
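Before spectra from different samples can be compared, they usually need to share the same wavenumber axis and intensity scale. The following is a minimal sketch of such a preprocessing step, not a description of our actual pipeline: the function names `resample` and `normalize`, the linear interpolation, the peak normalization, and the example values are all assumptions made for illustration.

```python
from bisect import bisect_left

def resample(wavenumbers, intensities, grid):
    """Linearly interpolate a spectrum onto a shared wavenumber grid.

    Assumes `wavenumbers` is strictly ascending. Grid points outside the
    measured range take the nearest measured intensity.
    """
    out = []
    for x in grid:
        if x <= wavenumbers[0]:
            out.append(intensities[0])
        elif x >= wavenumbers[-1]:
            out.append(intensities[-1])
        else:
            i = bisect_left(wavenumbers, x)  # first measured point >= x
            x0, x1 = wavenumbers[i - 1], wavenumbers[i]
            y0, y1 = intensities[i - 1], intensities[i]
            out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out

def normalize(intensities):
    """Scale a spectrum so that its strongest band has intensity 1."""
    peak = max(intensities)
    return [v / peak for v in intensities] if peak > 0 else list(intensities)

# Hypothetical raw spectrum: wavenumbers in cm^-1 and arbitrary intensities.
raw_wavenumbers = [1000.0, 1100.0, 1200.0]
raw_intensities = [0.0, 2.0, 1.0]
common_grid = [1000.0, 1050.0, 1150.0, 1200.0]

resampled = resample(raw_wavenumbers, raw_intensities, common_grid)
features = normalize(resampled)
print(resampled)  # [0.0, 1.0, 1.5, 1.0]
```

Each normalized list can then serve as one feature vector, with one value per spectral bin.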
Subsequently, we combine these spectra to construct a training library for a random forest (RF) classifier. By integrating spectral data from multiple samples with their relevant attribute information, we build a comprehensive and representative training library that covers a wide range of scenarios and features. The RF classifier is then used to assign the glycan samples to be predicted to different functional group positions, such as 2-O-, 4-O-, and 6-O-. Because the differences between functional group positions can be complex and subtle, the model must be constructed so that it captures and judges them accurately.
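To make the idea of a labelled training library and an RF classifier concrete, the sketch below implements a very small random forest in plain Python: each tree is grown on a bootstrap sample and considers a random subset of spectral bins at every split, and the forest predicts by majority vote. It is an illustration under stated assumptions, not our production model. The three-bin toy spectra, the labels, and every function name are hypothetical, and in practice an established library implementation would normally be used instead.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree. Leaves are labels; inner nodes are (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or depth == max_depth:
        return majority
    n_features = len(X[0])
    split = best_split(X, y, rng.sample(range(n_features), n_sub))
    if split is None:  # the random subset offered no useful split: try all bins
        split = best_split(X, y, range(n_features))
    if split is None:
        return majority
    f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       n_sub, rng, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       n_sub, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Follow the splits until a leaf label is reached (labels must not be tuples)."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the library."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(X[0]) ** 0.5))  # bins considered per split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], n_sub, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Idealized, noise-free toy library (hypothetical): one intensity per bin.
library = [
    ([0.8, 0.0, 0.0], "2-O"), ([0.9, 0.0, 0.0], "2-O"), ([1.0, 0.0, 0.0], "2-O"),
    ([0.0, 0.8, 0.0], "4-O"), ([0.0, 0.9, 0.0], "4-O"), ([0.0, 1.0, 0.0], "4-O"),
    ([0.0, 0.0, 0.8], "6-O"), ([0.0, 0.0, 0.9], "6-O"), ([0.0, 0.0, 1.0], "6-O"),
]
X = [spectrum for spectrum, _ in library]
y = [label for _, label in library]
forest = train_forest(X, y)
prediction = predict_forest(forest, [0.9, 0.0, 0.0])
```

Real spectra have many more bins and overlapping bands, so a realistic library needs far more careful validation than this toy example suggests.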
In addition, because spectral data are highly correlated, we apply feature selection and evolutionary algorithms during training to reduce the size of the feature space. By screening for the most representative and critical features and optimizing the model parameters with evolutionary algorithms, we limit overfitting, reduce the computational cost, and improve the accuracy and stability of the predictions.
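One common way to combine feature selection with an evolutionary algorithm is to evolve binary masks over the spectral bins and to score each mask by the accuracy it yields minus a small penalty per selected bin. The sketch below illustrates that idea only. It assumes a simple genetic algorithm (truncation selection, one-point crossover, bit-flip mutation) and, to keep the example short, scores masks with a leave-one-out 1-nearest-neighbour classifier as a cheap stand-in for the RF model. All names, parameter values, and data are hypothetical.

```python
import random

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the selected bins."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    hits = 0
    for i, row in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in cols))
        hits += y[nearest] == y[i]
    return hits / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy and charge a small cost for every selected bin."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def evolve_mask(X, y, pop_size=12, generations=20, mutation=0.1, seed=0):
    """Evolve a 0/1 mask over the bins; the best individuals always survive."""
    rng = random.Random(seed)
    n = len(X[0])
    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability `mutation`.
            child = [bit ^ (rng.random() < mutation) for bit in child]
            children.append(child)
        pop = parents + children            # parents are kept unchanged
    return max(pop, key=lambda m: fitness(X, y, m))

# Hypothetical toy data: bins 0 and 1 carry the signal, bins 2 and 3 are noise.
X = [[0.9, 0.1, 0.5, 0.2], [0.8, 0.2, 0.1, 0.7],
     [0.1, 0.9, 0.6, 0.3], [0.2, 0.8, 0.2, 0.6]]
y = ["2-O", "2-O", "6-O", "6-O"]
best_mask = evolve_mask(X, y)
```

Because the parents are carried over unchanged in every generation, the returned mask never scores worse than the full feature set that the population starts with.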
Technology: Helium nanodroplet spectra and IR spectra
Journal: Journal of the American Chemical Society
IF: 14.4
Published: 2023
Results: This article uses low-temperature (cryogenic) IR spectroscopy and an RF model to predict the structural patterns of glycosaminoglycans (GAGs). First, the researchers recorded low-temperature gas-phase IR spectra of mass-selected heparan sulfate (HS) disaccharide, tetrasaccharide, and hexasaccharide ions to extract vibrational features associated with structural patterns. They then combined these data with chondroitin sulfate (CS) disaccharide spectra to form a training library for the RF classifier. By optimizing data preprocessing and RF modeling, they achieved over 97% prediction accuracy for HS tetrasaccharides and hexasaccharides based on a training set of only 21 spectra. The study thus shows that an RF model can predict the structural patterns of GAGs from the vibrational features in gas-phase cryogenic IR spectra.
Fig. 1 Application of RF for analyzing IR spectra of GAGs. (Riedel et al., 2023)
CD BioGlyco uses a variety of glycobiological techniques to offer personalized glycoinformatics-assisted IR-based glycan structure prediction services to our clients. We are dedicated to continually improving our technology to advance glycoinformatics-assisted glycan research. If you are interested in the details of our services, please feel free to contact us.
Reference