Multi-resolution characterization of molecular taxonomies in bulk and single-cell transcriptomics data
Published in | Nucleic acids research Vol. 49; no. 17; p. e98 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | England, Oxford University Press, 27.09.2021 |
Summary: | Abstract
As high-throughput genomics assays become more efficient and cost-effective, their utilization has become standard in large-scale biomedical projects. These studies are often explorative, in that relationships between samples are not explicitly defined a priori, but rather emerge from data-driven discovery and annotation of molecular subtypes, thereby informing hypotheses and independent evaluation. Here, we present K2Taxonomer, a novel unsupervised recursive partitioning algorithm and associated R package that utilize ensemble learning to identify robust subgroups in a ‘taxonomy-like’ structure. K2Taxonomer was devised to accommodate different data paradigms, and is suitable for the analysis of both bulk and single-cell transcriptomics, and other ‘-omics’ data. For each of these data types, we demonstrate the power of K2Taxonomer to discover known relationships in both simulated and human tissue data. We conclude with a practical application on breast cancer tumor-infiltrating lymphocyte (TIL) single-cell profiles, in which we identified co-expression of translational machinery genes as a dominant transcriptional program shared by T cell subtypes and associated with better prognosis in breast cancer tissue bulk expression data.
Graphical Abstract
Overview of the K2Taxonomer algorithm, performance evaluation (simulated, bulk, single-cell data), and in-silico study identifying translation as a transcriptional program shared across a diverse subgroup of breast tumor infiltrating immunocytes. |
---|---|
ISSN: | 0305-1048 1362-4962 |
DOI: | 10.1093/nar/gkab552 |