Multiple kernel learning with random effects for predicting longitudinal outcomes and data integration

Bibliographic Details
Published in	Biometrics Vol. 71; no. 4; pp. 918 - 928
Main Authors	Chen, Tianle, Zeng, Donglin, Wang, Yuanjia
Format	Journal Article
Language	English
Published	United States: Blackwell Publishing Ltd, International Biometric Society, 01.12.2015
Subjects	Algorithms; Alzheimer disease; Alzheimer Disease - diagnosis; Alzheimer Disease - etiology; Alzheimer's disease; biometry; Biometry - methods; chronic diseases; Clinical outcomes; cohort studies; Computer Simulation; DISCUSSION PAPER; Disease prediction; Disease Progression; epidemiological studies; etiology; Humans; Huntington Disease - diagnosis; Huntington Disease - etiology; Huntington's disease; image analysis; Integrative analysis; Latent effects; Longitudinal Studies; Machine Learning - statistics & numerical data; Models, Statistical; natural history; prediction; risk; Statistical learning; Statistics, Nonparametric; Support Vector Machine - statistics & numerical data
Summary:	Predicting disease risk and progression is one of the main goals of many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years, and data are collected at multiple visits. Although kernel‐based statistical learning methods have proven powerful for a wide range of disease prediction problems, they have been well studied only for independent data, not for longitudinal data. It is thus important to develop time‐sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject‐specific short‐term and long‐term latent effects through a designed kernel to account for within‐subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning method with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use a different kernel for each data source, taking advantage of the distinctive features of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's disease (Alzheimer's Disease Neuroimaging Initiative, ADNI), where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data.
Bibliography:	http://dx.doi.org/10.1111/biom.12343; ArticleID:BIOM12343; ark:/67375/WNG-5TC7SNRX-C; istex:2C3257F1FC74D8457BB19A4994600A901497F6DE
ISSN:	0006-341X, 1541-0420
DOI:	10.1111/biom.12343
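
Illustration:	The Summary describes combining one kernel per data source with a designed kernel that carries subject-specific long-term and short-term latent effects. The record does not give the kernel's form, so the following is only a minimal sketch under stated assumptions, not the authors' method: the within-subject kernel (a constant long-term term plus an exponentially decaying short-term term), the fixed kernel weights, the kernel ridge fit, the synthetic data, and all function names are hypothetical. A multiple kernel learning method would estimate the weights rather than fix them.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=0.5):
    """Gaussian kernel between the rows of X and the rows of Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def subject_kernel(ids_a, t_a, ids_b, t_b, long_var=1.0, short_var=0.5, rho=1.0):
    """Hypothetical within-subject kernel (an assumption, not the paper's design).

    Visits from the same subject share a constant 'long-term' covariance plus a
    'short-term' part that decays with the time gap between visits. Visits from
    different subjects get zero.
    """
    same = (ids_a[:, None] == ids_b[None, :]).astype(float)
    lag = np.abs(t_a[:, None] - t_b[None, :])
    return same * (long_var + short_var * np.exp(-lag / rho))

def combined_kernel(a, b, weights=(0.5, 0.5)):
    """Weighted sum of per-source kernels plus the subject kernel.

    The weights are fixed here for illustration only.
    """
    k_clin = rbf_kernel(a["clin"], b["clin"])   # source 1: Gaussian kernel
    k_img = a["img"] @ b["img"].T               # source 2: linear kernel
    k_subj = subject_kernel(a["id"], a["t"], b["id"], b["t"])
    return weights[0] * k_clin + weights[1] * k_img + k_subj

def fit(train, y, lam=1.0):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    K = combined_kernel(train, train)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict(train, alpha, new):
    """Predict outcomes for new visits from the fitted coefficients."""
    return combined_kernel(new, train) @ alpha

# Synthetic longitudinal data: 30 subjects, 4 visits each, two data sources.
rng = np.random.default_rng(0)
n_subj, n_visits = 30, 4
ids = np.repeat(np.arange(n_subj), n_visits)
t = np.tile(np.arange(n_visits, dtype=float), n_subj)
clin = rng.normal(size=(ids.size, 3))
img = rng.normal(size=(ids.size, 5))
subject_effect = rng.normal(size=n_subj)[ids]   # latent per-subject shift
y = (np.sin(clin[:, 0]) + 0.3 * img[:, 0] + subject_effect
     + 0.1 * rng.normal(size=ids.size))

def subset(mask):
    return {"clin": clin[mask], "img": img[mask], "id": ids[mask], "t": t[mask]}

# Train on earlier visits, predict each subject's last visit.
is_last = t == n_visits - 1
train, test = subset(~is_last), subset(is_last)
alpha = fit(train, y[~is_last])
pred = predict(train, alpha, test)
print("held-out MSE:", np.mean((pred - y[is_last]) ** 2))
```

Because the subject kernel is nonzero only for visits from the same subject, each subject's earlier visits inform the prediction for that subject's later visit, which is one simple way to use within-subject correlation in a kernel method.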