Using extreme phenotype sampling to identify the rare causal variants of quantitative traits in association studies

Bibliographic Details
Published in Genetic epidemiology Vol. 35; no. 8; pp. 790-799
Main Authors Li, Dalin; Lewinger, Juan Pablo; Gauderman, William J.; Murcray, Cassandra Elizabeth; Conti, David
Format Journal Article
Language English
Published Hoboken: Wiley Subscription Services, Inc., A Wiley Company, 01.12.2011
Summary: Variants identified in recent genome‐wide association studies based on the common‐disease common‐variant hypothesis are far from fully explaining the hereditability of complex traits. Rare variants may, in part, explain some of the missing hereditability. Here, we explored the advantage of extreme phenotype sampling in rare‐variant analysis and refined this design framework for future large‐scale association studies on quantitative traits. We first proposed a power calculation approach for a likelihood‐based analysis method. We then used this approach to demonstrate the potential advantages of extreme phenotype sampling for rare variants. Next, we discussed how this design can influence future sequencing‐based association studies from a cost‐efficiency perspective (with the phenotyping cost included). Moreover, we discussed the potential of a two‐stage design with the extreme sample as the first stage and the remaining nonextreme subjects as the second stage. We demonstrated that this two‐stage design is a cost‐efficient alternative to the one‐stage cross‐sectional design or the traditional two‐stage design. We then discussed the analysis strategies for this extreme two‐stage design and proposed a corresponding design optimization procedure. To address practical concerns such as measurement error or phenotypic heterogeneity at the very extremes, we examined an approach in which individuals with very extreme phenotypes are discarded. We demonstrated that even with a substantial proportion of these extreme individuals discarded, extreme‐based sampling can still be more efficient. Finally, we expanded the current analysis and design framework to accommodate the CMC (combined multivariate and collapsing) approach, in which multiple rare variants in the same gene region are analyzed jointly. Genet. Epidemiol. 35:790‐799, 2011. © 2011 Wiley Periodicals, Inc.
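The intuition behind extreme phenotype sampling can be illustrated with a toy simulation. The sketch below is not the authors' likelihood-based power calculation; it only shows, under assumed values, that carriers of a rare trait-increasing variant are enriched in the upper phenotype tail. The function names, the minor allele frequency of 0.01, the effect size of one standard deviation per allele, and the 5% tail fraction are all illustrative assumptions, not figures from the article.

```python
import random


def simulate_cohort(n, maf, effect, seed=0):
    """Simulate (genotype, phenotype) pairs for a single rare variant.

    Assumptions (illustrative only): genotypes follow Hardy-Weinberg
    proportions, and the phenotype equals effect * genotype plus
    standard normal noise.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        # Number of minor alleles (0, 1, or 2): two independent draws.
        genotype = int(rng.random() < maf) + int(rng.random() < maf)
        phenotype = effect * genotype + rng.gauss(0.0, 1.0)
        cohort.append((genotype, phenotype))
    return cohort


def extreme_sample(cohort, fraction):
    """Return the lowest and highest `fraction` of subjects by phenotype."""
    ranked = sorted(cohort, key=lambda pair: pair[1])
    k = int(len(ranked) * fraction)
    return ranked[:k], ranked[len(ranked) - k:]


def carrier_frequency(subjects):
    """Fraction of subjects carrying at least one minor allele."""
    return sum(1 for genotype, _ in subjects if genotype > 0) / len(subjects)


# Hypothetical parameter values chosen only for illustration.
cohort = simulate_cohort(n=20000, maf=0.01, effect=1.0, seed=1)
low, high = extreme_sample(cohort, fraction=0.05)

print("carrier frequency, whole cohort:", carrier_frequency(cohort))
print("carrier frequency, low tail:    ", carrier_frequency(low))
print("carrier frequency, high tail:   ", carrier_frequency(high))
```

Under these assumptions, sequencing only the two tails captures a disproportionate share of carriers relative to a random sample of the same size, which is the enrichment the summary appeals to when it describes the design as cost-efficient.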
Bibliography: National Institute of Environmental Health Sciences - No. ES015090; No. GM069890
National Institute on Drug Abuse - No. DA020830; No. CA084735
National Human Genome Research Institute - No. U01HG005927
ArticleID: GEPI20628
ark:/67375/WNG-CDV3DD1B-6
istex:011C48329278B91C8BCCD851E7D19AFAFFFCD6BD
ISSN:0741-0395
1098-2272
DOI:10.1002/gepi.20628