IM-MS Self-Expanding Database Unlocks Glycan Isomer Identification

May 4, 2026

Analytical Bottlenecks in Glycan Research

Glycans form a dense molecular layer on the surface of all eukaryotic cells and participate in numerous biological processes, including protein folding, cell proliferation and differentiation, immune regulation, fertilization, and pathogen recognition. Abnormal changes in cell-surface glycans are closely associated with the development of inflammatory bowel disease, septic shock, Alzheimer's disease, Parkinson's disease, and tumors. Furthermore, glycan modifications on biopharmaceuticals such as monoclonal antibodies directly influence their biological activity and pharmacokinetic properties.

However, glycan structure analysis has long been a major challenge in analytical chemistry. Glycans are composed of monosaccharides joined at different positions and in different anomeric configurations. Many isomers share the same molecular formula but differ in stereochemistry, linkage positions, and branching patterns, and these isomers often have drastically different biological functions. For example, avian influenza viruses preferentially recognize α2,3-linked sialic acids, whereas human influenza viruses bind α2,6-linked sialic acids to invade cells. Conventional mass spectrometry can determine the composition of a glycan but cannot easily distinguish between structural isomers.

Ion Mobility-Mass Spectrometry (IM-MS): Adding a Separation Dimension to Glycan Analysis

Ion mobility-mass spectrometry (IM-MS) adds a further dimension of separation to glycan analysis. In IM-MS, gas-phase ions travel through a drift tube filled with an inert buffer gas under the influence of an electric field. The ions collide with buffer gas molecules, and ions with larger collision cross sections migrate more slowly. From the measured drift time, a collision cross section (CCS) can be calculated for each ion. Intact or fragment ions of isomers often differ in gas-phase size and shape, and therefore show different CCS values and arrival time distributions (ATDs), which makes it possible to distinguish glycan isomers.
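For a uniform-field drift tube, the conversion from drift time to CCS is commonly done with the Mason-Schamp equation. The sketch below is a minimal illustration under that assumption, not the procedure used in the study. The function name, the default nitrogen buffer gas, and the example instrument settings are hypothetical, and the drift time is assumed to be already corrected for time spent outside the drift region.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
AMU = 1.66053906660e-27     # atomic mass unit, kg


def ccs_from_drift_time(drift_time_s, voltage_v, length_m, pressure_pa,
                        temperature_k, ion_mass_da, charge,
                        gas_mass_da=28.0134):
    """Return the CCS in Å^2 from the Mason-Schamp equation (low-field limit)."""
    # Mobility K = v / E = (L / t) / (V / L) = L^2 / (V * t)
    mobility = length_m ** 2 / (voltage_v * drift_time_s)
    # Buffer-gas number density from the ideal gas law
    number_density = pressure_pa / (K_B * temperature_k)
    # Reduced mass of the ion / buffer-gas pair, in kg
    reduced_mass = (ion_mass_da * gas_mass_da) / (ion_mass_da + gas_mass_da) * AMU
    ccs_m2 = (3.0 * charge * E_CHARGE / (16.0 * number_density)
              * math.sqrt(2.0 * math.pi / (reduced_mass * K_B * temperature_k))
              / mobility)
    return ccs_m2 * 1e20  # convert m^2 to Å^2


# Hypothetical settings: 0.78 m tube, 1450 V drop, 526.6 Pa of N2, 300 K,
# singly charged ion of mass 1000 Da arriving after 25 ms
example_ccs = ccs_from_drift_time(0.025, 1450.0, 0.78, 526.6, 300.0, 1000.0, 1)
print(f"CCS = {example_ccs:.1f} Å^2")
```

With all other settings fixed, the computed CCS scales linearly with drift time, which is why ions with larger cross sections arrive later.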

Although IM-MS holds great promise, applying it to glycan structure identification requires a large reference database that links mass-to-charge ratio (m/z), ATD, and CCS values to specific isomer structures. Because the structural diversity of natural glycans is enormous, preparing every possible standard by chemical synthesis is impractical.

Self-Expanding Database: From 19 Standards to 332 Entries

To address this limitation, the research team developed a high-throughput glycan identification and de novo sequencing method based on IM-MS. The method requires only a limited number of synthetic standards containing common glyco-epitopes, and the database expands itself through the following iterative strategy:

First, an initial database is established by subjecting synthetic standards to in-source collision-induced dissociation (CID) to generate intact ions and fragment ions with structural information. Their m/z and CCS values are measured to construct an initial IM-MS reference database.

Next, IM-MS analysis is performed on glycans in biological samples. If the m/z and ATD/CCS values of their intact ions match entries in the database, rapid high-throughput identification can be achieved.
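The matching step can be pictured as a lookup with two tolerances, one on m/z and one on CCS. The following is a minimal sketch under assumed tolerances (10 ppm and 1%, chosen only for illustration); the entry labels, m/z values, and the `match_entries` function are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    structure: str  # label of the reference structure
    mz: float       # mass-to-charge ratio
    ccs: float      # collision cross section, Å^2


def match_entries(database, mz, ccs, mz_tol_ppm=10.0, ccs_tol_pct=1.0):
    """Return all entries whose m/z and CCS both fall within tolerance."""
    hits = []
    for entry in database:
        mz_err_ppm = abs(mz - entry.mz) / entry.mz * 1e6
        ccs_err_pct = abs(ccs - entry.ccs) / entry.ccs * 100.0
        if mz_err_ppm <= mz_tol_ppm and ccs_err_pct <= ccs_tol_pct:
            hits.append(entry)
    return hits


# Two hypothetical isomers: identical m/z, different CCS
reference_db = [
    Entry("isomer_A", 1000.0000, 257.1),
    Entry("isomer_B", 1000.0000, 244.2),
]

hits = match_entries(reference_db, mz=1000.004, ccs=244.9)
print([entry.structure for entry in hits])
```

Because the two isomers share the same m/z, only the CCS tolerance separates them; a query that matches neither CCS returns an empty list and would be passed on to de novo sequencing.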

For unknown glycans not yet listed in the database, in-source CID is performed to generate fragment ions. By matching the m/z and CCS values of the fragment ions, the glycosidic bond type and linkage mode between monosaccharides are determined. Combined with known biosynthetic pathway rules, the fragment information is assembled into a complete glycan structure.

Finally, the m/z, ATD, and CCS values of newly identified complete glycans and fragment ions are incorporated into the reference database for high-throughput identification of subsequent samples or de novo sequencing of more complex structures.
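The four steps above can be sketched as a small loop in which every resolved unknown becomes a new reference entry. This is a simplified illustration, not the authors' implementation: the tolerances, labels, and numeric values are hypothetical, and the `assemble` callback is only a placeholder for the assembly step guided by biosynthetic rules, which is not modelled here.

```python
def lookup(database, mz, ccs, mz_tol_ppm=10.0, ccs_tol_pct=1.0):
    """Return the label of the first entry matching both m/z and CCS, else None."""
    for label, (ref_mz, ref_ccs) in database.items():
        if (abs(mz - ref_mz) / ref_mz * 1e6 <= mz_tol_ppm
                and abs(ccs - ref_ccs) / ref_ccs * 100.0 <= ccs_tol_pct):
            return label
    return None


def process_glycan(database, intact, fragments, assemble):
    """Identify a glycan by direct match, or resolve it de novo and store it."""
    label = lookup(database, *intact)
    if label is not None:
        return "matched", label
    # No intact match: try to explain the glycan through its fragment ions
    fragment_labels = [lookup(database, mz, ccs) for mz, ccs in fragments]
    if not fragment_labels or None in fragment_labels:
        return "unresolved", None
    structure = assemble(fragment_labels)
    database[structure] = intact  # self-expansion: the new entry is reusable
    return "de novo", structure


# Hypothetical starting database of two epitope fragment ions: (m/z, CCS in Å^2)
database = {
    "epitope_fragment_1": (500.0, 210.0),
    "epitope_fragment_2": (700.0, 240.0),
}


def join_labels(labels):
    """Placeholder assembly: simply concatenates the matched fragment labels."""
    return " + ".join(labels)


first = process_glycan(database, (1200.0, 300.0),
                       [(500.001, 210.5), (700.002, 239.0)], join_labels)
second = process_glycan(database, (1200.002, 300.5), [], join_labels)
print(first, second, len(database))
```

The second call needs no fragment data because the structure resolved in the first call is now matched directly, which is the behaviour that lets the database grow with each analysis. A fuller version would also store newly characterized fragment ions.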

Schematic workflow of IM-MS self-expanding database for glycan de novo sequencing and high-throughput identification.

Fig. 1 Schematic overview of the integrated IM-MS high-throughput identification and de novo sequencing methods for glycan structure determination. (Sastre Toraño, et al. 2025)

Using this strategy, the research team expanded the reference database from the initial 19 standards to 332 unique entries using only 12 glycan epitope fragment ions as a starting point. The database continuously accumulates data with each analysis, unlocking more complex glycan structures.

Practical Validation: A Structural Panorama of Human Milk Oligosaccharides

Human milk oligosaccharides (HMOs) were the first model system to which this method was applied. HMOs act as prebiotics, promoting the growth of beneficial bacteria in the infant gut, and as decoys that protect infants from infection. All HMOs are built on a lactose core, which is extended into type II or type I chains through N-acetyllactosamine (LacNAc) or lacto-N-biose (LNB), respectively, and can be further fucosylated or sialylated, resulting in extremely rich structural diversity.

The research team extracted HMOs from milk samples of five donors and performed LC-IM-MS analysis after derivatization. The results showed that 14 HMOs with known structures could be rapidly identified through direct database matching.

For unknown structures without database entries, de novo sequencing of fragment ions successfully resolved up to 43 of the most abundant HMOs, covering linear and I-branched structures, with the highest degree of polymerization reaching nine monosaccharides (DP9).

The study successfully distinguished isomers of sialyl Lewis x (sLex, type II structure) and sialyl Lewis a (sLea, type I structure), with significant differences in their CCS values (257.1 Å² vs. 244.2 Å²).
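To put the two CCS values in perspective, the short calculation below expresses the gap as a relative difference. The last line is only a rough yardstick and an assumption of this sketch: it takes the resolving power as CCS divided by peak width (FWHM) and asks what value would make the peak width equal to the gap between the two isomers.

```python
ccs_slex = 257.1  # Å^2, sialyl Lewis x (value from the text)
ccs_slea = 244.2  # Å^2, sialyl Lewis a (value from the text)

delta = ccs_slex - ccs_slea
relative_pct = delta / ccs_slea * 100.0

# Rough yardstick (assumption): resolving power R = CCS / FWHM at which
# the peak width equals the separation between the two isomers
mean_ccs = (ccs_slex + ccs_slea) / 2.0
yardstick_resolving_power = mean_ccs / delta

print(f"difference: {delta:.1f} Å^2 ({relative_pct:.1f}%)")
print(f"yardstick resolving power: {yardstick_resolving_power:.0f}")
```

The gap is 12.9 Å², or roughly 5% of the smaller value.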

Complex branched topologies were accurately identified by analyzing characteristic fragments and backbone ions of the I-branched structures.

Diagnostic CID fragmentation of linear and I-branched HMO structures into chain-type fragment ions for IM-MS analysis.

Fig. 2 Fragmentation of linear and I-branched structures into diagnostic chain type fragment ions. (Sastre Toraño, et al. 2025)

Based on the resolved fucosylation patterns, the study also classified the five donors according to secretor (Se) and Lewis (Le) status, revealing individual differences in the composition of human milk glycans.
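The article does not spell out the classification rule. As an assumption, the sketch below uses the textbook association of secretor status with α1,2-fucosylated structures and Lewis status with α1,4-fucosylated structures. The donor names, linkage sets, and the `classify_donor` function are hypothetical and do not reproduce the study's data.

```python
def classify_donor(fucose_linkages):
    """Return (secretor, Lewis) status from the set of fucose linkages found.

    Assumed rule: alpha1,2-fucose present -> Se+, alpha1,4-fucose present -> Le+.
    """
    secretor = "Se+" if "a1-2" in fucose_linkages else "Se-"
    lewis = "Le+" if "a1-4" in fucose_linkages else "Le-"
    return secretor, lewis


# Hypothetical donors with the fucose linkages resolved in their HMOs
donors = {
    "donor_A": {"a1-2", "a1-3", "a1-4"},
    "donor_B": {"a1-3", "a1-4"},
    "donor_C": {"a1-2", "a1-3"},
}

statuses = {name: classify_donor(linkages) for name, linkages in donors.items()}
for name, status in statuses.items():
    print(name, *status)
```

A classification of this kind is only possible because the linkage position of each fucose residue is resolved, which composition-level mass spectrometry alone would not provide.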

Notably, while theoretically 36 different isomers might exist for a DP9 HMO of this monosaccharide composition, only one major structure and three minor isomers were detected in all donor samples. This indicates that glycosyltransferases in HMO biosynthesis exhibit more refined substrate specificity than previously understood.

Biopharmaceutical Applications: Aflibercept and Transferrin's N-Glycans

To validate the applicability of the method to glycan analysis of biotherapeutic drugs, the research team focused on N-glycans. N-glycans share a common pentasaccharide core that can branch and extend into multiantennary structures, with terminal modifications similar to the glycan epitopes found in HMOs.

The research team first analyzed 49 complex N-glycan standards synthesized by chemoenzymatic methods and found that the high-resolution ATD of each glycan showed a unique multi-peak fingerprint that can serve as a basis for structural identification. The method was then applied to two important glycoproteins:

  • Aflibercept: An Fc fusion protein used to treat macular degeneration and metastatic colorectal cancer. The study directly identified six structures from the released N-glycans, including a glycoform with core fucosylation and α2,3-linked sialic acid, consistent with the known lack of α2,6-sialyltransferase in Chinese hamster ovary (CHO) cells. For glycans not included in the database, the complete structures of 19 biantennary complex and hybrid N-glycans were resolved by de novo sequencing, and the sialic acid linkages on different antennae were distinguished using diagnostic fragment ions F27-F32.
  • Transferrin: Its glycans are biomarkers for various diseases. This study successfully identified isomeric disialylated N-glycans with different combinations of α2,3- and α2,6-sialic acid linkages on transferrin, improving the structural resolution accuracy for its use as a diagnostic marker.
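One simple way to compare multi-peak ATD fingerprints such as those recorded for the N-glycan standards is a cosine similarity between intensity profiles. The sketch below assumes that all ATDs have been sampled on the same arrival-time grid. The intensity values and standard names are invented for illustration, and the article does not state which comparison metric the authors used.

```python
import math


def cosine_similarity(atd_a, atd_b):
    """Cosine similarity between two ATDs sampled on the same time grid."""
    if len(atd_a) != len(atd_b):
        raise ValueError("ATDs must be sampled on the same grid")
    dot = sum(a * b for a, b in zip(atd_a, atd_b))
    norm_a = math.sqrt(sum(a * a for a in atd_a))
    norm_b = math.sqrt(sum(b * b for b in atd_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_fingerprint_match(query, references):
    """Return (label, score) of the reference ATD most similar to the query."""
    scores = {label: cosine_similarity(query, atd)
              for label, atd in references.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Hypothetical reference fingerprints (relative intensities per time bin)
references = {
    "standard_1": [0, 2, 9, 3, 0, 1, 5, 1],  # two-peak pattern
    "standard_2": [0, 1, 4, 9, 4, 1, 0, 0],  # single broad peak
}
query = [0, 2, 8, 3, 0, 1, 4, 1]

label, score = best_fingerprint_match(query, references)
print(label, round(score, 3))
```

In practice a similarity threshold would also be needed, so that a query resembling none of the references is sent to de novo sequencing rather than assigned to the nearest entry.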

Methodological Advantages and Future Prospects

This approach establishes a new paradigm for glycan structure identification that does not rely on large-scale chemically synthesized standards. Its core advantages are:

  • Cost-Effective and Efficient: It requires only a limited number of synthetic standards to start, avoiding the enormous costs of synthesizing all natural glycans.
  • Continuously Evolving: The database is self-expanding, covering increasingly complex glycan structures as the number of analyzed samples increases.
  • Cross-Platform Applicability: CCS is an intrinsic property of an ion in a given buffer gas. Although measured values may vary slightly between IM-MS platforms, the method can be implemented on various instruments, including drift-tube, traveling wave, and trapped ion mobility spectrometry (TIMS) instruments, through standardized calibration procedures.
  • Structural Precision: It can distinguish between linkage isomers, stereoisomers, and branched isomers, providing an accurate structural basis for glycan functional studies.
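On instruments where CCS is not derived directly from first principles, it is usually obtained by calibration against ions of known CCS. One commonly used empirical form is a power law, CCS = A · t^B, fitted in log-log space. The sketch below illustrates only that idea. It omits the drift-time corrections and the charge and reduced-mass normalization used in real workflows, and the calibrant values are synthetic, generated from assumed parameters A = 100 and B = 0.6.

```python
import math


def fit_power_law(drift_times, ccs_values):
    """Least-squares fit of ln(CCS) = ln(A) + B * ln(t); returns (A, B)."""
    xs = [math.log(t) for t in drift_times]
    ys = [math.log(c) for c in ccs_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


def apply_calibration(drift_time, a, b):
    """Convert a drift time into a calibrated CCS with the fitted power law."""
    return a * drift_time ** b


# Synthetic calibrants (drift time in ms, CCS in Å^2), not measured data
calibrant_times = [2.0, 4.0, 6.0, 8.0]
calibrant_ccs = [100.0 * t ** 0.6 for t in calibrant_times]

a_fit, b_fit = fit_power_law(calibrant_times, calibrant_ccs)
unknown_ccs = apply_calibration(5.0, a_fit, b_fit)
print(f"A = {a_fit:.2f}, B = {b_fit:.3f}, CCS(5 ms) = {unknown_ccs:.1f} Å^2")
```

Once every platform reports CCS on a common calibrated scale, reference entries recorded on one instrument can in principle be matched against measurements from another, within the tolerance chosen for the lookup.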

In the future, this method is expected to be extended to the structural analysis of other glycoconjugates such as O-glycans and glycosphingolipids, and to support the development of glycan-based diagnostic reagents, targeted therapies, and precision nutrition products (such as customized infant formula). For the biopharmaceutical industry, more accurate glycan characterization will help optimize the quality control of recombinant proteins and antibody therapeutics, ensuring batch-to-batch consistency and safety. Glycans are often described as the third language of life, and the IM-MS self-expanding database method provides an increasingly comprehensive dictionary for deciphering this language.

Reference

  1. Sastre Toraño, J., et al. (2025). De novo sequencing of glycans by ion mobility-mass spectrometry using a self-expanding database. Nature Communications. DOI: 10.1038/s41467-025-67069-w.