A Bayesian semiparametric factor analysis model for subtype identification
Published in | Statistical Applications in Genetics and Molecular Biology, Vol. 16, no. 2, pp. 145-158 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Germany, De Gruyter, 25.04.2017, Walter de Gruyter GmbH |
ISSN | 2194-6302, 1544-6115 |
DOI | 10.1515/sagmb-2016-0051 |
Summary: | Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and many uninformative genes are included for clustering due to the high dimensionality. In this article, we introduce a novel subtype identification method in the Bayesian setting based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow the Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are clinically more relevant than those identified from the existing methods. |
---|---|