Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS

Bibliographic Details
Published in bioRxiv
Main Authors Aksenov, Alexander; Laponogov, Ivan; Zhang, Zheng; Doran, Sophie; Belluomo, Ilaria; Veselkov, Dennis; Bittremieux, Wout; Nothias, Louis Felix; Nothias-Esposito, Melissa; Maloney, Katherine N; Misra, Biswapriya Biswavas; Melnik, Alexey V; Jones, Kenneth L; Dorrestein, Kathleen; Panitchpakdi, Morgan; Ernst, Madeleine; van der Hooft, Justin J J; Gonzalez, Mabel; Carazzone, Chiara; Amezquita, Adolfo; Callewaert, Chris; Morton, Jamie; Quinn, Robert Andrew; Bouslimani, Amina; Albarracin Orio, Andrea; Petras, Daniel; Smania, Andrea Maria; Couvillion, Sneha P; Burnet, Meagan C; Nicora, Carrie D; Zink, Erika; Metz, Thomas O; Artaev, Viatcheslav; Humston-Fulmer, Elizabeth; Gregor, Rachel; Meijler, Michael M; Mizrahi, Itzhak; Eyal, Stav; Anderson, Brooke; Dutton, Rachel J; Lugan, Raphael; Le Boulch, Pauline; Guitton, Yann; Prevost, Stephanie; Poirier, Audrey; Dervilly, Gaud; Le Bizec, Bruno; Fait, Aharon; Sikron, Noga; Song, Chao; Gashu, Kelem; Coras, Roxana; Guma, Monica; Manasson, Julia; Scher, Jose U; Barupal, Dinesh K; Alseekh, Saleh; Fernie, Alisdair R; Mirnezami, Reza; Vasiliou, Vasilis; Schmid, Robin; Borisov, Roman S; Kulikova, Larisa N; Knight, Rob; Wang, Mingxun; Hanna, George; Dorrestein, Pieter C; Veselkov, Kirill
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 14.01.2020
Summary: Gas chromatography-mass spectrometry (GC-MS) represents an analytical technique with significant practical societal impact. Spectral deconvolution is an essential step for interpreting GC-MS data. No public GC-MS repositories that also enable repository-scale analysis exist, in part because deconvolution requires significant user input. We therefore engineered a scalable machine learning workflow for the Global Natural Product Social Molecular Networking (GNPS) analysis platform to enable the mass spectrometry community to store, process, share, annotate, compare, and perform molecular networking of GC-MS data. The workflow performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization, using a Fast Fourier Transform-based strategy to overcome scalability limitations. We introduce a "balance score" that quantifies the reproducibility of fragmentation patterns across all samples. We demonstrate the utility of the platform with breathomics analysis applied to the early detection of oesophago-gastric cancer, and by creating the first molecular spatial map of the human volatilome.

Footnotes
* https://www.youtube.com/watch?v=yrru-5nrsdk&feature=youtu.be
* https://www.youtube.com/watch?v=MblruOSglgI&feature=youtu.be
* https://www.youtube.com/watch?v=iX03r_mGi2Q&feature=youtu.be
* https://www.youtube.com/watch?v=mv-fw2zSgss&feature=youtu.be
* https://www.youtube.com/watch?v=nUhCZ9LwoM4&feature=youtu.be
* https://www.youtube.com/watch?v=_PehOiBqzzY&feature=youtu.be
DOI: 10.1101/2020.01.13.905091
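
The summary describes auto-deconvolution via unsupervised non-negative matrix factorization (NMF). As a rough illustration of that general idea only, the sketch below factorizes a small synthetic "scans × m/z" intensity matrix into non-negative elution profiles and fragmentation spectra. It is not the authors' workflow: the plain multiplicative-update NMF, the synthetic two-compound window, and every name in it (`nmf`, `synthetic_window`, `rel_error`) are assumptions made for illustration, and the FFT-based scaling strategy and the balance score are not reproduced because the record does not define them.

```python
import numpy as np


def nmf(X, rank, n_iter=1000, eps=1e-9, seed=0):
    """Factorize non-negative X (scans x m/z) as W @ H.

    Uses standard multiplicative updates for the Frobenius objective.
    W holds elution profiles (scans x rank); H holds spectra (rank x m/z).
    """
    rng = np.random.default_rng(seed)
    n_scans, n_mz = X.shape
    W = rng.random((n_scans, rank)) + eps
    H = rng.random((rank, n_mz)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def synthetic_window(n_scans=60, n_mz=40, seed=1):
    """Build a hypothetical window with two co-eluting compounds."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_scans)
    # Two overlapping Gaussian elution profiles (illustrative values).
    profiles = np.stack(
        [np.exp(-0.5 * ((t - 25) / 4.0) ** 2),
         np.exp(-0.5 * ((t - 32) / 4.0) ** 2)],
        axis=1,
    )
    # Each compound gets a sparse spectrum of six fragment peaks.
    spectra = np.zeros((2, n_mz))
    for k in range(2):
        peaks = rng.choice(n_mz, size=6, replace=False)
        spectra[k, peaks] = rng.random(6) + 0.1
    return profiles @ spectra


X = synthetic_window()
W, H = nmf(X, rank=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each row of `H` can be read as a candidate deconvolved spectrum and each column of `W` as its elution profile. In real data the number of components is not known in advance and would have to be chosen or estimated.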