Agnostic Framework for the Classification/Identification of Organisms Based on RNA Post-Transcriptional Modifications

Bibliographic Details
Published in: Analytical chemistry (Washington) Vol. 93; no. 22; pp. 7860-7869
Main Authors: McIntyre, William D; Nemati, Reza; Salehi, Mehraveh; Aldrich, Colin C; FitzGibbon, Molly; Deng, Limin; Pazos, Manuel A; Rose, Rebecca E; Toro, Botros; Netzband, Rachel E; Pager, Cara T; Robinson, Ingrid P; Bialosuknia, Sean M; Ciota, Alexander T; Fabris, Daniele
Format: Journal Article
Language: English
Published: United States, American Chemical Society, 08.06.2021
Summary: We propose a novel approach for building a classification/identification framework based on the full complement of RNA post-transcriptional modifications (rPTMs) expressed by an organism at basal conditions. The approach relies on advanced mass spectrometry techniques to characterize the products of exonuclease digestion of total RNA extracts. Sample profiles comprising identities and relative abundances of all detected rPTMs were used to train and test the capabilities of different machine learning (ML) algorithms. Each algorithm proved capable of identifying rigorous decision rules for differentiating closely related classes and correctly assigning unlabeled samples. The ML classifiers resolved different members of the Enterobacteriaceae family, alternative Escherichia coli serotypes, a series of Saccharomyces cerevisiae knockout mutants, and primary cells of the Homo sapiens central nervous system, which shared very similar genetic backgrounds. The excellent levels of accuracy and resolving power achieved by training on a limited number of classes were successfully replicated when the number of classes was significantly increased to escalate complexity. A dendrogram generated from ML-curated data exhibited a hierarchical organization that closely resembled those afforded by established taxonomic systems. Finer clustering patterns revealed the extensive effects induced by the deletion of a single pivotal gene. This information provided a putative roadmap for exploring the roles of rPTMs in their respective regulatory networks, which will be essential to decipher the epitranscriptomics code. The ubiquitous presence of RNA in virtually all living organisms promises to enable the broadest possible range of applications, with significant implications in the diagnosis of RNA-related diseases.
ISSN: 0003-2700, 1520-6882
DOI: 10.1021/acs.analchem.1c00359