CD BioGlyco has provided comprehensive Glycoinformatics-assisted Analysis/Prediction Services for over a decade, offering specialized expertise and current technology to support clients in their research on glycans. The workflow of our glycoinformatics-assisted mass-based glycan structure prediction service is as follows.
First, we collect and organize annotated mass spectrometry data derived from glycoconjugates of the major types studied in previous glycan experiments, such as N-linked glycans, O-linked glycans, and glycolipids. This dataset allows machine learning models to learn the fragmentation patterns and tendencies (i.e., intensity ratios) in the spectra that are needed to predict glycan structures.
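As a rough illustration of how a spectrum can be turned into intensity ratios that a model can learn from, the sketch below bins a peak list into a fixed-length vector and normalizes it so that each bin holds a fraction of the total intensity. The function name `bin_spectrum`, the m/z range, and the number of bins are assumptions chosen for illustration, not details of our production pipeline.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_spectrum(
    peaks: List[Peak],
    mz_min: float = 50.0,   # assumed lower m/z bound
    mz_max: float = 2000.0,  # assumed upper m/z bound
    n_bins: int = 20,        # assumed vector length
) -> List[float]:
    """Sum intensities into equal-width m/z bins and normalize to fractions."""
    bin_width = (mz_max - mz_min) / n_bins
    binned = [0.0] * n_bins
    for mz, intensity in peaks:
        if not (mz_min <= mz < mz_max):
            continue  # ignore peaks outside the modeled m/z window
        # Clamp the index to guard against floating-point rounding at the edge.
        idx = min(int((mz - mz_min) / bin_width), n_bins - 1)
        binned[idx] += intensity
    total = sum(binned)
    if total == 0:
        return binned  # empty or fully out-of-range spectrum
    return [value / total for value in binned]


# Hypothetical peak list; the last peak lies outside the window and is dropped.
example_peaks = [(100.0, 10.0), (150.0, 30.0), (1999.0, 60.0), (2500.0, 99.0)]
vector = bin_spectrum(example_peaks, mz_min=0.0, mz_max=2000.0, n_bins=4)
```

Normalizing by the total intensity makes spectra acquired at different signal levels comparable, which is one simple way to expose intensity ratios rather than absolute counts.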
We then train the model on this dataset. Throughout training, our experts employ a range of machine learning algorithms and deep learning techniques so that the model can capture the intricate information in the mass spectrometry data and accurately predict the glycan structure. After extensive experimentation and fine-tuning, we obtain a model with high accuracy and stability.
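One common way to check both accuracy and stability during model selection is k-fold cross-validation, where the mean score reflects accuracy and the spread across folds reflects stability. The sketch below uses a simple nearest-centroid classifier as a stand-in for the actual models, which are not described here; all function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def fit_centroids(X: Sequence[Sequence[float]], y: Sequence[str]) -> Dict[str, List[float]]:
    """Compute the mean feature vector (centroid) of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids: Dict[str, List[float]], vec: Sequence[float]) -> str:
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: sqdist(centroids[label], vec))


def cross_validate(X: Sequence[Sequence[float]], y: Sequence[str], k: int = 3) -> Tuple[float, float]:
    """Return (mean accuracy, standard deviation of accuracy) over k folds.

    Assumes len(X) >= k so that every fold has at least one test sample.
    """
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))  # every k-th sample forms the test fold
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [label for i, label in enumerate(y) if i not in test_idx]
        centroids = fit_centroids(train_X, train_y)
        hits = sum(predict(centroids, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores), pstdev(scores)


# Toy feature vectors for two hypothetical structure classes.
X = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
y = ["A", "A", "A", "B", "B", "B"]
mean_acc, std_acc = cross_validate(X, y, k=3)
```

A low standard deviation across folds suggests that performance does not depend heavily on which spectra happened to be in the training set.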
Finally, our experts use a Python-based workflow we developed to convert raw data into a format suitable for prediction and to predict glycan structures. This workflow also includes automated data filtering and fragment annotation steps. Our experts continue to optimize our models to analyze and predict more complex structures for you, further reduce false positive rates, and estimate relative abundance.
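The filtering and fragment annotation steps can be sketched roughly as follows. This is a minimal illustration, not our actual workflow: the function names, the 1% relative-intensity cutoff, and the 0.01 Da tolerance are assumptions, and the reference table contains only a few commonly cited positive-mode oxonium ion m/z values chosen as examples.

```python
from typing import Dict, List, Optional, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# Illustrative reference ions only (assumed positive-mode oxonium ions).
REFERENCE_FRAGMENTS: Dict[str, float] = {
    "HexNAc": 204.0867,
    "Neu5Ac": 292.1027,
    "Hex-HexNAc": 366.1395,
}


def filter_peaks(peaks: List[Peak], min_rel_intensity: float = 0.01) -> List[Peak]:
    """Drop peaks below a fraction of the base (most intense) peak."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return []
    return [(mz, i) for mz, i in peaks if i / base >= min_rel_intensity]


def annotate_peaks(
    peaks: List[Peak],
    reference: Dict[str, float] = REFERENCE_FRAGMENTS,
    tol_da: float = 0.01,
) -> List[Tuple[float, float, Optional[str]]]:
    """Label each peak with the closest reference ion within tol_da, else None."""
    annotated = []
    for mz, intensity in peaks:
        label: Optional[str] = None
        best = tol_da
        for name, ref_mz in reference.items():
            delta = abs(mz - ref_mz)
            if delta <= best:  # keep the closest match inside the tolerance
                label, best = name, delta
        annotated.append((mz, intensity, label))
    return annotated


# Hypothetical raw peak list: the 150.0 peak is below 1% of the base peak.
raw = [(204.0870, 1000.0), (366.1400, 500.0), (150.0, 5.0), (500.0, 200.0)]
annotated = annotate_peaks(filter_peaks(raw))
```

Filtering before annotation keeps low-intensity noise peaks from being matched to reference ions by chance.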
Technology: Tandem mass spectrometry
Journal: Nature Methods
IF: 47.99
Published: 2024
Results: This article focuses on a deep learning-based approach to predicting glycan structures from tandem mass spectrometry data. Glycosylation is a complex post-translational modification that plays a regulatory role in protein activity in health and disease. However, structural annotation from tandem mass spectrometry data is a bottleneck in glycomics research, limiting high-throughput studies and restricting the field to a few experts. In this study, a new curated dataset of 500,000 annotated tandem mass spectra was used for training, and a deep residual neural network called CandyCrunch was proposed that can predict glycan structures from raw liquid chromatography-tandem mass spectrometry data in seconds (top-1 accuracy of 90.3%).
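For readers unfamiliar with the metric, top-1 accuracy is the fraction of spectra for which the model's highest-ranked candidate structure matches the annotated structure. The sketch below shows the general top-k calculation; the function name and the placeholder labels are hypothetical and are not taken from the CandyCrunch code base.

```python
from typing import List, Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of samples whose true label is among the first k ranked candidates."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Placeholder structure labels; each inner list is ordered from most to least likely.
ranked: List[List[str]] = [["A", "B"], ["B", "A"], ["C", "A"], ["A", "C"]]
truth = ["A", "A", "C", "C"]
top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```

Raising k always gives an accuracy at least as high as top-1, since a correct top-1 prediction also counts as a hit for any larger k.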
Fig.1 Fragmentation mechanism uncovered by molecular dynamics. (Urban, et al., 2024)
With a proven track record of success, CD BioGlyco helps numerous clients advance their understanding of glycans by providing a reliable glycoinformatics-assisted mass-based glycan structure prediction service. If you are interested in further details, please feel free to contact us.
Reference