Discrete mixture modeling to address genetic heterogeneity in time-to-event regression

Motivation: Time-to-event regression models are a critical tool for associating survival time outcomes with molecular data. Despite mounting evidence that genetic subgroups of the same clinical disease exist, little attention has been given to exploring how this heterogeneity affects time-to-event m...

Full description

Saved in:
Bibliographic Details
Published inBioinformatics (Oxford, England) Vol. 30; no. 12; pp. 1690 - 1697
Main Authors Eng, Kevin H., Hanlon, Bret M.
Format Journal Article
LanguageEnglish
Published England Oxford University Press 15.06.2014
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Motivation: Time-to-event regression models are a critical tool for associating survival time outcomes with molecular data. Despite mounting evidence that genetic subgroups of the same clinical disease exist, little attention has been given to exploring how this heterogeneity affects time-to-event model building and how to accommodate it. Methods able to diagnose and model heterogeneity should be valuable additions to the biomarker discovery toolset. Results: We propose a mixture of survival functions that classifies subjects with similar relationships to a time-to-event response. This model incorporates multivariate regression and model selection and can be fit with an expectation maximization algorithm, we call Cox-assisted clustering. We illustrate a likely manifestation of genetic heterogeneity and demonstrate how it may affect survival models with little warning. An application to gene expression in ovarian cancer DNA repair pathways illustrates how the model may be used to learn new genetic subsets for risk stratification. We explore the implications of this model for censored observations and the effect on genomic predictors and diagnostic analysis. Availability and implementation: R implementation of CAC using standard packages is available at https://gist.github.com/programeng/8620b85146b14b6edf8f Data used in the analysis are publicly available. Contact: kevin.eng@roswellpark.org Supplementary information: Supplementary data are available at Bioinformatics online.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
Associate Editor: Gunnar Ratsch
ISSN:1367-4803
1367-4811
1367-4811
DOI:10.1093/bioinformatics/btu065