Scalable Prediction of Acute Myeloid Leukemia Using High-Dimensional Machine Learning and Blood Transcriptomics

Bibliographic Details
Published in: iScience Vol. 23; no. 1; p. 100780
Main Authors: Warnat-Herresthal, Stefanie; Perrakis, Konstantinos; Taschler, Bernd; Becker, Matthias; Baßler, Kevin; Beyer, Marc; Günther, Patrick; Schulte-Schrepping, Jonas; Seep, Lea; Klee, Kathrin; Ulas, Thomas; Haferlach, Torsten; Mukherjee, Sach; Schultze, Joachim L.
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 24.01.2020
Summary: Acute myeloid leukemia (AML) is a severe, mostly fatal hematopoietic malignancy. We were interested in whether transcriptomic-based machine learning could predict AML status without requiring expert input. Using 12,029 samples from 105 different studies, we present a large-scale study of machine learning-based prediction of AML in which we address key questions relating to the combination of machine learning and transcriptomics and their practical use. We find data-driven, high-dimensional approaches (in which multivariate signatures are learned directly from genome-wide data with no prior knowledge) to be accurate and robust. Importantly, these approaches are highly scalable with low marginal cost, essentially matching human expert annotation in a near-automated workflow. Our results support the notion that transcriptomics combined with machine learning could be used as part of an integrated -omics approach wherein risk prediction, differential diagnosis, and subclassification of AML are achieved by genomics while diagnosis could be assisted by transcriptomic-based machine learning.

Highlights:
• Study presents one of the largest transcriptomics datasets to date for AML prediction
• Effective classifiers can be obtained by high-dimensional machine learning
• Accuracy increases with dataset size
• Includes challenging scenarios such as cross-study and cross-technology

Subject areas: Artificial Intelligence; Biological Sciences; Cancer; Computer Science; Omics; Transcriptomics
ISSN: 2589-0042
DOI: 10.1016/j.isci.2019.100780