DECO: decompose heterogeneous population cohorts for patient stratification and discovery of sample biomarkers using omic data profiling

Abstract Motivation Patient and sample diversity is one of the main challenges when dealing with clinical cohorts in biomedical genomics studies. During last decade, several methods have been developed to identify biomarkers assigned to specific individuals or subtypes of samples. However, current m...

Full description

Saved in:

Bibliographic Details
Published in	Bioinformatics Vol. 35; no. 19; pp. 3651 - 3662
Main Authors	Campos-Laborie, F J, Risueño, A, Ortiz-Estévez, M, Rosón-Burgo, B, Droste, C, Fontanillo, C, Loos, R, Sánchez-Santos, J M, Trotter, M W, De Las Rivas, J
Format	Journal Article
Language	English
Published	England Oxford University Press 01.10.2019
Subjects	Original Papers
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Abstract Motivation Patient and sample diversity is one of the main challenges when dealing with clinical cohorts in biomedical genomics studies. During last decade, several methods have been developed to identify biomarkers assigned to specific individuals or subtypes of samples. However, current methods still fail to discover markers in complex scenarios where heterogeneity or hidden phenotypical factors are present. Here, we propose a method to analyze and understand heterogeneous data avoiding classical normalization approaches of reducing or removing variation. Results DEcomposing heterogeneous Cohorts using Omic data profiling (DECO) is a method to find significant association among biological features (biomarkers) and samples (individuals) analyzing large-scale omic data. The method identifies and categorizes biomarkers of specific phenotypic conditions based on a recurrent differential analysis integrated with a non-symmetrical correspondence analysis. DECO integrates both omic data dispersion and predictor–response relationship from non-symmetrical correspondence analysis in a unique statistic (called h-statistic), allowing the identification of closely related sample categories within complex cohorts. The performance is demonstrated using simulated data and five experimental transcriptomic datasets, and comparing to seven other methods. We show DECO greatly enhances the discovery and subtle identification of biomarkers, making it especially suited for deep and accurate patient stratification. Availability and implementation DECO is freely available as an R package (including a practical vignette) at Bioconductor repository (http://bioconductor.org/packages/deco/). Supplementary information Supplementary data are available at Bioinformatics online.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23
ISSN:	1367-4803 1460-2059 1367-4811
DOI:	10.1093/bioinformatics/btz148