The gene expression classifier ALLCatchR identifies B-precursor ALL subtypes and underlying developmental trajectories across age

Bibliographic Details
Published in bioRxiv
Main Authors Beder, Thomas; Hansen, Bjoern-Thore; Hartmann, Alina M; Zimmermann, Johannes; Amelunxen, Eric; Wolgast, Nadine; Walter, Wencke; Zaliova, Marketa; Antic, Zeljko; Chouvarine, Philippe; Bartsch, Lorenz; Barz, Malwine; Bultmann, Miriam; Horns, Johanna; Bendig, Sonja; Kaessens, Jan; Kaleta, Christoph; Cario, Gunnar; Schrappe, Martin; Neumann, Martin; Goekbuget, Nicola; Bergmann, Anke Katharina; Trka, Jan; Haferlach, Claudia; Brueggemann, Monika; Baldus, Claudia D; Bastian, Lorenz
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 03.02.2023
Summary: Current classifications (WHO-HAEM5 / ICC) define up to 26 molecular B-cell precursor acute lymphoblastic leukemia (BCP-ALL) disease subtypes, each characterized by genomic driver aberrations and corresponding gene expression signatures. Identification of driver aberrations by RNA-Seq is well established, while systematic approaches for gene expression analysis are less advanced. Therefore, we developed ALLCatchR, a machine-learning-based classifier that uses RNA-Seq expression data to allocate BCP-ALL samples to 21 defined molecular subtypes. Trained on n=1,869 transcriptome profiles with established subtype definitions (4 cohorts; 55% pediatric / 45% adult), ALLCatchR allowed subtype allocation in 3 independent hold-out cohorts (n=1,018; 75% pediatric / 25% adult) with 95.7% accuracy (averaged sensitivity across subtypes: 91.1% / specificity: 99.8%). "High confidence predictions" were achieved in 84.6% of samples with 99.7% accuracy. Only 1.2% of samples remained "unclassified". ALLCatchR outperformed existing tools and identified novel candidates in previously unassigned samples. We established a novel RNA-Seq reference of human B-lymphopoiesis. Implementation in ALLCatchR enabled projection of BCP-ALL samples to this trajectory, which identified shared patterns of proximity of BCP-ALL subtypes to normal lymphopoiesis stages. ALLCatchR supports routine application of RNA-Seq in BCP-ALL diagnostics with systematic gene expression analysis for accurate subtype allocations and novel insights into underlying developmental trajectories.
Competing Interest Statement: The authors have declared no competing interest.
DOI: 10.1101/2023.02.01.526553
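
The summary reports overall accuracy together with sensitivity and specificity averaged across subtypes. As a rough illustration of how such figures are commonly derived from hold-out predictions, the Python sketch below computes them one-vs-rest per subtype and then takes the unweighted mean. This is not ALLCatchR's code or the authors' evaluation pipeline: the function name `evaluate`, the placeholder subtype labels, and the toy data are assumptions made for illustration only, and the record does not state exactly how the averaging was done.

```python
from statistics import mean


def evaluate(true_labels, predicted_labels):
    """Accuracy plus one-vs-rest sensitivity and specificity averaged across subtypes.

    Hypothetical helper for illustration; not part of ALLCatchR.
    """
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two non-empty label lists of equal length")

    pairs = list(zip(true_labels, predicted_labels))
    n = len(pairs)
    sensitivities, specificities = [], []

    # Treat each subtype in turn as the "positive" class (one-vs-rest).
    for subtype in sorted(set(true_labels)):
        tp = sum(1 for t, p in pairs if t == subtype and p == subtype)
        fn = sum(1 for t, p in pairs if t == subtype and p != subtype)
        fp = sum(1 for t, p in pairs if t != subtype and p == subtype)
        tn = n - tp - fn - fp

        # tp + fn > 0 because the subtype occurs in true_labels.
        sensitivities.append(tp / (tp + fn))
        # Specificity is undefined when no sample belongs to another subtype.
        if tn + fp:
            specificities.append(tn / (tn + fp))

    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / n,
        "sensitivity": mean(sensitivities),
        "specificity": mean(specificities) if specificities else float("nan"),
    }


if __name__ == "__main__":
    # Toy data with placeholder subtype names; not taken from the study.
    truth = ["A", "A", "B", "B", "C", "C"]
    calls = ["A", "A", "B", "C", "C", "C"]
    print(evaluate(truth, calls))
```

Averaging per subtype rather than per sample gives rare subtypes the same weight as common ones, which is one plausible reason an averaged sensitivity can sit below the overall accuracy.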