Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs

Bibliographic Details
Published in NeuroImage (Orlando, Fla.) Vol. 102; pp. 220-228
Main Authors Cao, Hongbao; Duan, Junbo; Lin, Dongdong; Shugart, Yin Yao; Calhoun, Vince; Wang, Yu-Ping
Format Journal Article
Language English
Published United States Elsevier Inc 15.11.2014
Elsevier Limited
Summary: Integrative analysis of multiple data types can take advantage of their complementary information and may therefore provide higher power to identify potential biomarkers that would be missed when each data type is analyzed individually. Because data modalities differ in nature, integrating them is challenging. Here we address the data integration problem by developing a generalized sparse model (GSM) that uses weighting factors to integrate multi-modality data for biomarker selection. As an example, we applied the GSM to a joint analysis of two types of schizophrenia data sets: 759,075 SNPs and 153,594 functional magnetic resonance imaging (fMRI) voxels in 208 subjects (92 cases/116 controls). To solve this small-sample–large-variable problem, we developed a novel sparse representation based variable selection (SRVS) algorithm, with the primary aim of identifying biomarkers associated with schizophrenia. To validate the effectiveness of the selected variables, we performed multivariate classification followed by ten-fold cross-validation. We compared our proposed SRVS algorithm with an earlier sparse model based variable selection algorithm for integrated analysis. In addition, we compared it with traditional statistical methods for univariate data analysis (chi-squared test for SNP data and ANOVA for fMRI data). Results showed that our proposed SRVS method can identify novel biomarkers that show stronger capability in distinguishing schizophrenia patients from healthy controls. Moreover, better classification ratios were achieved using biomarkers from both types of data, suggesting the importance of integrative analysis.
• Sparse Representation based variable selection (SRVS) with properties was provided.
• SRVS was tested integrating both fMRI and SNP data modalities.
• The SRVS with L0, L1 and L0.5 penalization terms were tested and compared.
• SRVS results using integrated data set and separated data sets were compared.
• Matlab code of SRVS algorithm has been made available online.
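
The general idea in the summary (weight two modalities, combine them, and keep the variables that receive nonzero coefficients under a sparsity penalty) can be illustrated with a minimal Python sketch. This is not the authors' SRVS algorithm or their Matlab code: it assumes a plain L1 penalty solved by iterative soft-thresholding, and the function names (`standardize`, `integrate`, `lasso_ista`, `select_variables`), the weights `w_a` and `w_b`, the penalty `lam`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    X = X - X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return X / sd

def integrate(X_a, X_b, w_a=0.5, w_b=0.5):
    """Weight two standardized modalities and stack them column-wise.
    w_a and w_b are illustrative weighting factors, not values from the paper."""
    return np.hstack([w_a * standardize(X_a), w_b * standardize(X_b)])

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))                        # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b

def select_variables(X, y, lam):
    """Return the column indices that receive a nonzero coefficient."""
    return np.flatnonzero(lasso_ista(X, y - y.mean(), lam))

# Toy data (assumption): 300 subjects, 30 "voxels", 20 "SNPs" coded 0/1/2.
rng = np.random.default_rng(0)
voxels = rng.standard_normal((300, 30))
snps = rng.integers(0, 3, size=(300, 20)).astype(float)
signal = standardize(voxels)[:, 3] + standardize(snps)[:, 5]
y = np.where(signal > 0, 1.0, -1.0)  # +1 = case, -1 = control

X = integrate(voxels, snps)  # columns 0-29 are voxels, 30-49 are SNPs
lam = 0.5 * np.abs(X.T @ (y - y.mean())).max()
selected = select_variables(X, y, lam)
print("selected columns:", selected)
```

In this toy setup the label depends on voxel column 3 and SNP column 5 (column 35 after stacking), so those two columns should be among the selected ones. Raising `lam` makes the selection sparser, and changing `w_a` relative to `w_b` shifts how strongly each modality competes for selection.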
ISSN: 1053-8119
1095-9572
DOI: 10.1016/j.neuroimage.2014.01.021