Sparse linear discriminant analysis for simultaneous testing for the significance of a gene set/pathway and gene selection

Bibliographic Details
Published in Bioinformatics Vol. 25; no. 9; pp. 1145-1151
Main Authors Wu, Michael C., Zhang, Lingsong, Wang, Zhaoxi, Christiani, David C., Lin, Xihong
Format Journal Article
Language English
Published Oxford Oxford University Press 01.05.2009
Oxford Publishing Limited (England)
Summary:
Motivation: Pathway and gene set-based approaches for the analysis of gene expression profiling experiments have become increasingly popular for addressing problems associated with individual gene analysis. Since most genes are not differentially expressed, existing gene set tests, which consider all the genes within a gene set, are subject to considerable noise and power loss, a concern exacerbated in studies in which the degree of differential expression is moderate for truly differentially expressed genes. For a significantly differentially expressed pathway, it is also of substantial interest to select important genes that drive the differential expression of the pathway.
Methods: We develop a unified framework to jointly test the significance of a pathway and to select a subset of genes that drive the significant pathway effect. To achieve dimension reduction and gene selection, we decompose each gene pathway into a single score by using a regularized form of linear discriminant analysis, called sparse linear discriminant analysis (sLDA). Testing for the significance of the pathway effect proceeds via permutation of the sLDA score. The sLDA-based test is compared with competing approaches with simulations and two applications: a study on the effect of metal fume exposure on immune response and a study of gene expression profiles among Type II Diabetes patients.
Results: Our results show that sLDA-based testing provides a powerful approach for testing the significance of a differentially expressed pathway and for gene selection.
Availability: An implementation of the proposed sLDA-based pathway test in the R statistical computing environment is available at http://www.hsph.harvard.edu/∼mwu/software/
Contact: xlin@hsph.harvard.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
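
The general idea in the Methods summary (reduce a pathway to one sparse score, then assess that score by permutation) can be illustrated with a rough Python sketch. This is not the authors' sLDA estimator or their R implementation. As a stand-in, it assumes a much simpler sparse direction: per-gene standardized mean differences, soft-thresholded at a hypothetical tuning value `lam`. The function names (`sparse_direction`, `score_statistic`, `permutation_test`) and the choice of an absolute Welch t-statistic on the score are likewise assumptions made for illustration only.

```python
import numpy as np


def sparse_direction(X, y, lam):
    """Soft-thresholded standardized mean differences (a stand-in, NOT the paper's sLDA)."""
    X1, X0 = X[y == 1], X[y == 0]
    n1, n0 = len(X1), len(X0)
    # Pooled within-group variance of each gene.
    pooled_var = ((n1 - 1) * X1.var(axis=0, ddof=1)
                  + (n0 - 1) * X0.var(axis=0, ddof=1)) / (n1 + n0 - 2)
    d = (X1.mean(axis=0) - X0.mean(axis=0)) / np.sqrt(pooled_var + 1e-12)
    # Soft-thresholding sets weak genes to exactly zero, which performs gene selection.
    return np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)


def score_statistic(X, y, lam):
    """Absolute Welch t-statistic of the one-dimensional pathway score X @ w."""
    w = sparse_direction(X, y, lam)
    if not np.any(w):
        return 0.0, w  # no gene survives the threshold
    s = X @ w
    s1, s0 = s[y == 1], s[y == 0]
    se = np.sqrt(s1.var(ddof=1) / len(s1) + s0.var(ddof=1) / len(s0))
    return abs(s1.mean() - s0.mean()) / se, w


def permutation_test(X, y, lam=0.5, n_perm=500, seed=0):
    """Permutation p-value for the pathway score, plus indices of selected genes."""
    rng = np.random.default_rng(seed)
    observed, w = score_statistic(X, y, lam)
    exceed = 0
    for _ in range(n_perm):
        # Refit the direction on permuted labels so that the selection step
        # is part of the null distribution.
        stat, _ = score_statistic(X, rng.permutation(y), lam)
        exceed += int(stat >= observed)
    p_value = (exceed + 1) / (n_perm + 1)
    return p_value, np.flatnonzero(w)
```

Here `X` is assumed to be a samples-by-genes expression matrix for one pathway and `y` a 0/1 group label vector. The direction is re-estimated inside every permutation because the score is fitted to the labels, so comparing against a fixed direction would overstate significance.
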
Bibliography:istex:627CB0AE59BBB82DAAD2DA297A2CD0467ED5FB3C
ArticleID:btp019
To whom correspondence should be addressed.
Associate Editor: David Rocke
ark:/67375/HXZ-Q5WNPMJN-S
ISSN:1367-4803
1367-4811
1460-2059
DOI:10.1093/bioinformatics/btp019