Glycans form a dense molecular layer covering the surface of all eukaryotic cells and participate in numerous life processes such as protein folding, cell proliferation and differentiation, immune regulation, fertilization, and pathogen recognition. Abnormal changes in cell surface glycans are closely related to the development of inflammatory bowel disease, septic shock, Alzheimer's disease, Parkinson's disease, and tumors. Furthermore, glycan modifications on biopharmaceuticals such as monoclonal antibodies directly affect their biological activity and pharmacokinetic properties.
However, glycan structure analysis has long been a major challenge in analytical chemistry. Glycans are composed of monosaccharides joined at different positions and in different anomeric configurations, so many isomers share the same molecular formula but differ in stereochemistry, linkage position, and branching pattern. These isomers often have drastically different biological functions. For example, avian influenza viruses preferentially recognize α2,3-linked sialic acids, whereas human influenza viruses use α2,6-linked sialic acids to enter cells. Conventional mass spectrometry can determine glycan composition but cannot easily distinguish structural isomers.
Ion mobility-mass spectrometry (IM-MS) adds a new dimension to glycan analysis. In IM-MS, gaseous ions travel through a drift tube filled with an inert buffer gas under the influence of an electric field. Ions collide with buffer gas molecules along the way, and ions with larger surface areas are slowed more and arrive later. The drift time can be converted into a collision cross section (CCS) for each ion. Intact or fragment ions of isomers often differ in surface area and therefore show different CCS values and arrival time distributions (ATDs), which makes it possible to distinguish glycan isomers.
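As a rough illustration of how a drift time becomes a CCS, the sketch below applies the low-field Mason-Schamp relation, the standard textbook equation for drift-tube instruments. The article does not state which calibration or equation the authors used, so this is an assumption. The function name, the instrument settings (drift length, voltage, pressure, temperature), and the example ion are all hypothetical placeholders.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # atomic mass unit, kg
PA_PER_TORR = 133.322368     # pascals per Torr


def ccs_from_drift_time(drift_time_s, ion_mass_da, charge,
                        length_m=0.78, voltage_v=1400.0,
                        pressure_torr=3.95, temp_k=300.0,
                        gas_mass_da=28.0134):
    """Return the CCS in square angstroms via the Mason-Schamp relation.

    drift_time_s is assumed to be the time spent in the drift region only.
    All default instrument settings are hypothetical; the default gas mass
    corresponds to nitrogen (N2).
    """
    # Mobility K = v / E = (L / t) / (V / L) = L^2 / (V * t), in m^2/(V*s)
    mobility = length_m ** 2 / (voltage_v * drift_time_s)
    # Buffer gas number density from the ideal gas law, in m^-3
    number_density = pressure_torr * PA_PER_TORR / (K_B * temp_k)
    # Reduced mass of the ion / buffer-gas pair, in kg
    reduced_mass = (ion_mass_da * gas_mass_da
                    / (ion_mass_da + gas_mass_da)) * DALTON
    ccs_m2 = (3.0 * charge * E_CHARGE / (16.0 * number_density)
              * math.sqrt(2.0 * math.pi / (reduced_mass * K_B * temp_k))
              / mobility)
    return ccs_m2 * 1e20  # m^2 -> square angstroms


# Two hypothetical isomers of a 1000 Da singly charged ion
compact = ccs_from_drift_time(0.025, 1000.0, 1)
extended = ccs_from_drift_time(0.026, 1000.0, 1)
print(f"{compact:.1f} vs {extended:.1f} (CCS in square angstroms)")
```

With all other settings fixed, the CCS scales linearly with drift time, which is why the later-arriving isomer is assigned the larger cross section.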
Although IM-MS holds great promise, applying it to glycan structure identification requires a large reference database that links mass-to-charge ratio (m/z), ATD, and CCS values to specific isomer structures. Because the structural diversity of natural glycans is so vast, chemically synthesizing every possible standard is impractical.
To address this limitation, the research team developed a high-throughput glycan identification and de novo sequencing method based on IM-MS. The method requires only a limited number of synthetic standards containing common glyco-epitopes and expands its own database through the following iterative strategy:
First, an initial database is established by subjecting synthetic standards to in-source collision-induced dissociation (CID) to generate intact ions and fragment ions with structural information. Their m/z and CCS values are measured to construct an initial IM-MS reference database.
Next, IM-MS analysis is performed on glycans in biological samples. If the m/z and ATD/CCS values of their intact ions match entries in the database, rapid high-throughput identification can be achieved.
For unknown glycans not yet listed in the database, in-source CID is performed to generate fragment ions. By matching the m/z and CCS values of the fragment ions, the glycosidic bond type and linkage mode between monosaccharides are determined. Combined with known biosynthetic pathway rules, the fragment information is assembled into a complete glycan structure.
Finally, the m/z, ATD, and CCS values of newly identified complete glycans and fragment ions are incorporated into the reference database for high-throughput identification of subsequent samples or de novo sequencing of more complex structures.
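The match-then-expand loop described in these steps can be sketched as follows. This is a minimal illustration, not the authors' software: the class and function names, the tolerances (10 ppm on m/z, 1% on CCS), and the m/z values are hypothetical placeholders, and the de novo sequencing step is reduced to a structure name supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefEntry:
    structure: str   # name of the glycan or fragment structure
    mz: float        # mass-to-charge ratio
    ccs: float       # collision cross section, square angstroms


def find_matches(db, mz, ccs, ppm_tol=10.0, ccs_tol_pct=1.0):
    """Return database entries within both tolerances (hypothetical values)."""
    hits = []
    for entry in db:
        ppm_err = abs(mz - entry.mz) / entry.mz * 1e6
        ccs_err = abs(ccs - entry.ccs) / entry.ccs * 100.0
        if ppm_err <= ppm_tol and ccs_err <= ccs_tol_pct:
            hits.append(entry)
    return hits


def identify_or_add(db, mz, ccs, sequenced_structure=None):
    """Identify an ion by database matching, else store a newly solved one.

    sequenced_structure stands in for the result of de novo sequencing,
    which is outside the scope of this sketch.
    """
    hits = find_matches(db, mz, ccs)
    if hits:
        # Prefer the entry whose CCS is closest to the measurement
        best = min(hits, key=lambda e: abs(e.ccs - ccs))
        return best.structure
    if sequenced_structure is not None:
        db.append(RefEntry(sequenced_structure, mz, ccs))
        return sequenced_structure
    return None


# Two isomers with identical (placeholder) m/z but different CCS
db = [RefEntry("isomer_A", 1000.0, 257.1),
      RefEntry("isomer_B", 1000.0, 244.2)]

known = identify_or_add(db, 1000.002, 256.5)
added = identify_or_add(db, 1200.0, 300.0, sequenced_structure="new_glycan")
again = identify_or_add(db, 1200.001, 300.5)
print(known, added, again, len(db))
```

The second call finds no match and stores the newly solved structure, so the third call identifies the same ion by direct lookup. This is the self-expansion behaviour the strategy relies on.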

Fig. 1 Schematic overview of the integrated IM-MS high-throughput identification and de novo sequencing methods for glycan structure determination. (Sastre Toraño, et al. 2025)
Using this strategy, and starting from only 19 standards and 12 glyco-epitope fragment ions, the research team expanded the reference database to 332 unique entries. Because the database grows with each analysis, progressively more complex glycan structures become accessible.
Human milk oligosaccharides (HMOs) were the first model system to which the method was applied. HMOs have prebiotic functions, promoting the growth of beneficial bacteria in the infant gut, and also act as decoys that protect infants from infection. All HMOs are built on a lactose core, which is extended into type II or type I chains by N-acetyllactosamine (LacNAc) or lacto-N-biose (LNB) units, respectively, and can be further fucosylated or sialylated, resulting in extremely rich structural diversity.
The research team extracted HMOs from samples of five human milk donors and performed LC-IM-MS analysis after derivatization. The results showed that 14 HMOs with known structures could be rapidly identified through direct database matching.
For unknown structures without database entries, de novo sequencing of fragment ions successfully resolved up to 43 of the most abundant HMOs, covering linear and I-branched structures, with the highest degree of polymerization reaching nine monosaccharides (DP9).
The study successfully distinguished the isomers sialyl Lewis x (sLex, type II structure) and sialyl Lewis a (sLea, type I structure), whose CCS values differ markedly (257.1 Å² vs. 244.2 Å²).
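To put the two quoted CCS values in perspective, the short calculation below expresses their gap as a percentage of their mean. The helper name is hypothetical, and the article itself does not report this figure.

```python
def relative_ccs_difference_pct(ccs_a, ccs_b):
    """Difference between two CCS values as a percentage of their mean."""
    return abs(ccs_a - ccs_b) / ((ccs_a + ccs_b) / 2.0) * 100.0


slex_ccs = 257.1  # square angstroms, value quoted in the article
slea_ccs = 244.2  # square angstroms, value quoted in the article
diff_pct = relative_ccs_difference_pct(slex_ccs, slea_ccs)
print(f"{diff_pct:.1f} %")
```

With the values above, the difference comes out at roughly 5%, which can be compared against whatever CCS matching tolerance a given workflow uses.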
Complex branched topologies were accurately identified by analyzing characteristic fragments and backbone ions of the I-branched structures.

Fig. 2 Fragmentation of linear and I-branched structures into diagnostic chain type fragment ions. (Sastre Toraño, et al. 2025)
Based on the resolved fucosylation patterns, the study also classified the five donors according to secretor (Se) and Lewis (Le) status, revealing individual differences in the composition of human milk glycans.
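A donor classification of this kind could be sketched as below. The article does not spell out its criteria, so the sketch assumes the commonly used convention that α1,2-fucosylated HMOs indicate an active secretor gene (FUT2) and α1,4-fucosylated HMOs indicate an active Lewis gene (FUT3). The function name and the donor data are hypothetical.

```python
def classify_donor(has_alpha12_fucose, has_alpha14_fucose):
    """Map detected fucose linkages to a secretor/Lewis label.

    Assumed convention (not taken from the article):
      alpha1,2-fucosylated HMOs present -> secretor positive (Se+)
      alpha1,4-fucosylated HMOs present -> Lewis positive (Le+)
    """
    se = "Se+" if has_alpha12_fucose else "Se-"
    le = "Le+" if has_alpha14_fucose else "Le-"
    return f"{se}/{le}"


# Hypothetical detection results: (alpha1,2-Fuc found, alpha1,4-Fuc found)
donors = {
    "donor_1": (True, True),
    "donor_2": (False, True),
    "donor_3": (True, False),
}
labels = {name: classify_donor(a12, a14)
          for name, (a12, a14) in donors.items()}
for name, label in labels.items():
    print(name, label)
```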
Notably, although 36 different isomers are theoretically possible for a DP9 HMO of this monosaccharide composition, only one major structure and three minor isomers were detected across all donor samples. This suggests that the glycosyltransferases involved in HMO biosynthesis have a more refined substrate specificity than previously understood.
To test whether the method also applies to glycan analysis of biotherapeutics, the research team turned to N-glycans. N-glycans share a common pentasaccharide core, which can branch and extend into multiantennary structures carrying terminal modifications similar to the glyco-epitopes found in HMOs.
The research team first analyzed 49 complex N-glycan standards prepared by chemoenzymatic synthesis and found that the high-resolution ATD of each glycan showed a unique multi-peak fingerprint that can serve as a basis for structural identification. The method was then applied to two important glycoproteins:
This approach establishes a new paradigm for glycan structure identification that does not rely on large-scale chemically synthesized standards. Its core advantages are:
In the future, this method is expected to be extended to the structural analysis of other glycan classes and glycoconjugates, such as O-glycans and glycosphingolipids, and to support the development of glycan-based diagnostic reagents, targeted therapies, and precision nutrition products (such as customized infant formula). For the biopharmaceutical industry, more accurate glycan characterization will help optimize the quality control of recombinant proteins and antibody therapeutics, supporting batch-to-batch consistency and safety. Glycans are often called the third language of life, and the self-expanding IM-MS database provides an increasingly comprehensive dictionary for deciphering that language.
Reference