ISiCLE: A molecular collision cross section calculation pipeline for establishing large in silico reference libraries for compound identification
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 21.09.2018 |
Summary: | Comprehensive and confident identifications of metabolites and other
chemicals in complex samples will revolutionize our understanding of the role
these chemically diverse molecules play in biological systems. Despite recent
advances, metabolomics studies still result in the detection of a
disproportionate number of features that cannot be confidently assigned to a
chemical structure. This inadequacy is driven by the single most significant
limitation in metabolomics: the reliance on reference libraries constructed by
analysis of authentic reference chemicals. To this end, we have developed the
in silico chemical library engine (ISiCLE), a high-performance
computing-friendly cheminformatics workflow for generating libraries of
chemical properties. In the instantiation described here, we predict probable
three-dimensional molecular conformers using chemical identifiers as input,
from which collision cross sections (CCS) are derived. The approach employs
state-of-the-art first-principles simulation, distinguished by use of molecular
dynamics, quantum chemistry, and ion mobility calculations to generate
structures and libraries, all without training data. Importantly, optimization
of ISiCLE included a refactoring of the popular MOBCAL code for
trajectory-based mobility calculations, improving its computational efficiency
by over two orders of magnitude. Calculated CCS values were validated against
1,983 experimentally measured CCS values and compared to previously reported
CCS calculation approaches. An online database is introduced for sharing both
calculated and experimental CCS values (metabolomics.pnnl.gov), initially
including a CCS library with over 1 million entries. Finally, three successful
applications of molecule characterization using calculated CCS are described.
This work represents a promising method to address the limitations of small
molecule identification. |
---|---|
DOI: | 10.48550/arxiv.1809.08378 |