CD BioGlyco provides clients with a Glycoinformatics-assisted Glycan-Molecular Interaction Analysis Service. Our protein-glycan interaction predictive modelling services predict protein-glycan interactions using state-of-the-art bioinformatics and computational biology techniques.
Protein sequence data: We first collect protein sequence information from public databases or client-provided data. These data are used for subsequent feature extraction and model training.
Glycan structural data: We collect structural information about glycans, including their sequence, composition, linkage patterns, and possible modifications.
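As an illustration of how glycan sequence, composition, and linkage information can be held in machine-readable form, the sketch below splits a linear glycan string into residues and linkages. It assumes the input is written in an IUPAC-condensed style such as `Neu5Ac(a2-3)Gal(b1-4)GlcNAc`; the function names `parse_glycan` and `glycan_composition` are hypothetical and are not part of our pipeline. Branched glycans (written with square brackets) and modifications are deliberately left out of this minimal example.

```python
import re
from collections import Counter

# One linkage, e.g. "(b1-4)" or "(a2-3)"; "?" marks an undetermined anomer or position.
# The single capturing group makes re.split keep the linkage text in its output.
LINKAGE = re.compile(r"\(([ab?][\d?]-[\d?])\)")


def parse_glycan(condensed):
    """Split a linear IUPAC-condensed string into (residues, linkages)."""
    tokens = LINKAGE.split(condensed.strip())
    residues = tokens[0::2]   # even positions hold monosaccharide names
    linkages = tokens[1::2]   # odd positions hold the captured linkages
    if any(not name for name in residues):
        raise ValueError("malformed glycan string: %r" % condensed)
    return residues, linkages


def glycan_composition(condensed):
    """Count how often each monosaccharide occurs in the glycan."""
    residues, _ = parse_glycan(condensed)
    return Counter(residues)
```

Keeping residues and linkages in two parallel lists means that `linkages[i]` always describes the bond between `residues[i]` and `residues[i + 1]`.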
Preprocessing: We use computational methods to identify potential glycan binding sites on proteins. This involves an in-depth analysis of the 3D protein structure, which allows us to predict where these interactions are likely to occur.
The glycoinformatics model is a deep learning (DL) model we developed specifically for predicting protein-glycan interactions. The model draws on the strengths of convolutional neural networks and processes protein sequence information and glycan structural information simultaneously. Based on the input data, it proposes potential interactions by taking into account spatial structure, molecular composition, and energetic factors. The model is central to our service: it is trained on a large database of known protein-glycan interactions, which supports accurate and reliable predictions.
Feature extraction: The glycoinformatics model first extracts key features from protein sequence and glycan structure data. For proteins, we extract the amino acid sequence, secondary structure, physicochemical properties, and other characteristics. For glycans, we extract characteristics such as glycosyl composition, linkage patterns, and spatial conformation.
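To make the feature extraction step more concrete, here is a minimal sketch that turns a protein sequence and a list of monosaccharides into fixed-length numeric vectors. It is an illustration under stated assumptions, not the feature set of the glycoinformatics model: amino acid composition and mean Kyte-Doolittle hydropathy stand in for the sequence and physicochemical features, a simple count vector stands in for glycosyl composition, and the function names and the `vocabulary` argument are hypothetical. Secondary structure and spatial conformation are not covered.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here as one example of a
# physicochemical property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_features(sequence):
    """Return 20 amino acid fractions followed by the mean hydropathy."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty protein sequence")
    counts = Counter(seq)
    composition = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    # Non-standard residues (e.g. "X") contribute 0.0 to the hydropathy sum.
    hydropathy = sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq) / len(seq)
    return composition + [hydropathy]


def glycan_features(residues, vocabulary):
    """Count each monosaccharide in `vocabulary` among the glycan residues."""
    counts = Counter(residues)
    return [counts[name] for name in vocabulary]
```

For example, `glycan_features(["Gal", "GlcNAc", "Gal"], ["Gal", "GlcNAc", "Man"])` yields one count per vocabulary entry, so every glycan maps to a vector of the same length regardless of its size.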
DL network construction: The extracted features are input into the DL network of the glycoinformatics model. The network processes the sequence information of the protein and the structural information of the glycan in separate branches, and the outputs are then fused through a fully connected layer to predict the interaction between the protein and the glycan.
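The two-branch layout described above can be sketched as a forward pass in plain Python. This is a toy illustration, not the architecture of the glycoinformatics model: each branch is a single dense layer with ReLU (standing in for whatever encoder is actually used), the weights are random and untrained, and the class name `TwoBranchModel` and its parameters are hypothetical.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoBranchModel:
    def __init__(self, protein_dim, glycan_dim, hidden=8, seed=0):
        rng = random.Random(seed)
        self.protein_layer = init_layer(protein_dim, hidden, rng)
        self.glycan_layer = init_layer(glycan_dim, hidden, rng)
        # The fusion layer sees both branch outputs concatenated.
        self.fusion_layer = init_layer(2 * hidden, 1, rng)

    def forward(self, protein_x, glycan_x):
        p = relu(dense(protein_x, *self.protein_layer))  # protein branch
        g = relu(dense(glycan_x, *self.glycan_layer))    # glycan branch
        fused = p + g                                    # list concatenation
        logit = dense(fused, *self.fusion_layer)[0]
        return 1.0 / (1.0 + math.exp(-logit))            # interaction score
```

The final sigmoid maps the fused representation to a score between 0 and 1, which can be read as an interaction probability once the weights have been trained.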
Model training and optimization: We use a large amount of known protein-glycan interaction data to train the glycoinformatics model. We iteratively adjust the model parameters to minimize the prediction error, using backpropagation and a gradient descent optimizer. We also use techniques such as cross-validation to evaluate the model and confirm that it generalizes to unseen data.
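The training and evaluation loop can be illustrated with a deliberately small stand-in: a logistic regression classifier fitted by batch gradient descent and scored with k-fold cross-validation. This is a sketch under stated assumptions rather than our training code. For a single-layer model, backpropagation reduces to the gradient `prediction - label` used below; the data come from a hypothetical `make_toy_data` helper that generates synthetic points, not real interaction data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the mean cross-entropy loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict((w, b), xi) - yi  # dLoss/dlogit for cross-entropy
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    hits = sum((predict(model, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(X)


def k_fold_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; every sample is tested once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        model = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k


def make_toy_data(n=60, seed=1):
    """Synthetic two-class data: the first feature separates the classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(2.0 if label else -2.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]
    return X, y
```

Because each fold is held out from training before it is scored, the averaged accuracy estimates performance on data the model has not seen, which is the purpose of cross-validation in this step.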
The trained glycoinformatics model can be used to predict novel protein-glycan interactions. Clients only need to provide protein sequence and glycan structure data, and our model returns prediction results quickly.
Technical support: We provide clients with around-the-clock technical support to answer any questions that arise during use.
Results visualization and interpretation: We present the prediction results to clients as intuitive, easy-to-understand charts so that they can be grasped quickly. In addition, our expert team analyzes the prediction results and provides clients with detailed interpretation reports.
Technology: Development of DL model for predicting non-covalent carbohydrate-binding sites on proteins
Journal: Frontiers in Bioinformatics
IF: 3.7
Published: 2023
Results: The authors introduced two DL models known as CArbohydrate-Protein interaction Site IdentIFier (CAPSIF) that were designed to predict non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). They found that both models surpassed previous surrogate methods for carbohydrate-binding site prediction, with CAPSIF:V demonstrating superior performance compared to CAPSIF:G. Further testing on AlphaFold2-predicted protein structures revealed that CAPSIF:V performed consistently well on both experimentally determined structures and AlphaFold2-predicted structures.
Fig.1 Two DL models that predict the location of protein-carbohydrate binding. (Canner, et al., 2023)
CD BioGlyco continuously updates its methods and technologies to reflect the latest advances in protein-glycan interaction predictive modelling. This keeps our predictive modelling services current and allows us to deliver precise, accurate, and effective predictions. Please feel free to contact us if you are interested in our protein-glycan interaction predictive modelling service.
Reference
We envision a future where the intricate world of carbohydrates is no longer shrouded in mystery, but rather illuminated by the power of cutting-edge computational tools.