A unified powerful set-based test for sequencing data analysis of GxE interactions
Summary: The development of next-generation sequencing technologies has allowed researchers to study comprehensively the contribution of genetic variation, particularly rare variants, to complex diseases. To date, many sequencing analyses of rare variants have focused on marginal genetic effects and have...
Published in | Biostatistics (Oxford, England) Vol. 18; no. 1; pp. 119 - 131 |
---|---|
Main Authors | Su, Yu-Ru; Di, Chong-Zhi; Hsu, Li |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 01.01.2017 |
ISSN | 1465-4644 (print); 1468-4357 (electronic) |
DOI | 10.1093/biostatistics/kxw034 |
Abstract | Summary: The development of next-generation sequencing technologies has allowed researchers to study comprehensively the contribution of genetic variation, particularly rare variants, to complex diseases. To date, many sequencing analyses of rare variants have focused on marginal genetic effects and have not explored the potential role environmental factors play in modifying genetic risk. Analysis of gene-environment interaction (GxE) for rare variants poses considerable challenges because of variant rarity and the paucity of subjects who carry the variants while being exposed. To tackle this challenge, we propose a hierarchical model to jointly assess the GxE effects of a set of rare variants, for example, in a gene or regulatory region, leveraging the information across the variants. Under this model, GxE is modeled by two components. The first component incorporates variant functional information as weights to calculate the weighted burden of variant alleles across variants, and then assesses the interaction of this burden with the environmental factor. Since this information is known a priori, this component enters the model as fixed effects. The second component involves residual GxE effects that have not been accounted for by the fixed effects. In this component, the residual GxE effects are postulated to follow an unspecified distribution with mean 0 and variance $\tau^2$. We develop a novel testing procedure by deriving two independent score statistics for the fixed effects and the variance component separately. We propose two data-adaptive approaches for combining these two score statistics and establish their asymptotic distributions. An extensive simulation study shows that the proposed approaches maintain the correct type I error and that their power is comparable to or better than that of existing methods under a wide range of scenarios. Finally, we illustrate the proposed methods with an exome-wide GxE analysis with NSAID use in colorectal cancer. |
---|---|
Author | Hsu, Li; Su, Yu-Ru; Di, Chong-Zhi |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/27474101 (view this record in MEDLINE/PubMed) |
BookMark | eNp9kd1KAzEQhYMo1qpvIJIXWJu_ze56IRSpVSgIotdhNptodE3qJrX69m6tivXCqxlm5nwM5wzRtg_eIHREyQklFR_VLsQEycXkdBw9vS0JF1tojwpZZoLnxfZnn2dCCjFAwxgfCWGMS76LBqwQhaCE7qGbMV54Z51p8DwsTWcXLY4mZTXEfpRMTNiGrh-9LIzXzt_jBhJg8NC-RxdxsHj6NsHOJ9OBTi74eIB2LLTRHH7VfXR3Mbk9v8xm19Or8_Es04KUKaOgBeiiqWpmBWVarF4qGXBmpQXRgJEyrzkpdMUkSColryprSwNVXpW84vvobM2dL-pn02jjUwetmnfuGbp3FcCpzY13D-o-vKqc5TnJSQ84_g34UX670x-crg90F2LsjFXarSwPK55rFSVqFYXaiEKto-jF4o_4m_-v7ANsMZb1 |
CitedBy_id | crossref_primary_10_1158_2767_9764_CRC_21_0119 crossref_primary_10_1016_j_xgen_2024_100591 crossref_primary_10_1038_s41598_023_28172_4 crossref_primary_10_1093_aje_kwx227 crossref_primary_10_1093_aje_kwx228 crossref_primary_10_3389_fgene_2021_710055 crossref_primary_10_3389_fgene_2021_682638 crossref_primary_10_1093_g3journal_jkae263 crossref_primary_10_1158_1055_9965_EPI_19_1018 crossref_primary_10_1002_sim_8037 crossref_primary_10_1093_jnci_djac094 crossref_primary_10_1016_j_ebiom_2024_105146 crossref_primary_10_1038_s41576_024_00731_z crossref_primary_10_1002_gepi_22351 crossref_primary_10_1002_gepi_22273 crossref_primary_10_1093_biostatistics_kxad004 crossref_primary_10_1007_s10519_021_10058_8 crossref_primary_10_1007_s40471_018_0135_2 crossref_primary_10_1002_gepi_22348 crossref_primary_10_1371_journal_pgen_1008081 crossref_primary_10_1002_cam4_2971 crossref_primary_10_1111_biom_13407 crossref_primary_10_1038_s41598_022_23451_y crossref_primary_10_1002_sim_8446 crossref_primary_10_1038_s41380_019_0627_6 crossref_primary_10_1038_s41416_023_02312_z crossref_primary_10_1194_jlr_P119000226 |
Cites_doi | 10.1155/2015/143712 10.1002/gepi.21908 10.1214/aos/1015957397 10.1093/biostatistics/kxs014 10.1111/biom.12368 10.1017/CBO9780511801389 10.1093/biostatistics/kxt006 10.1093/biostatistics/kxs015 10.1016/j.ajhg.2011.07.007 10.1007/BF02985802 10.1159/000312643 10.1053/j.gastro.2012.12.020 10.1002/gepi.21735 10.1016/j.csda.2008.11.025 10.1016/j.cell.2007.10.052 10.1016/j.amjcard.2004.02.058 10.1002/gepi.21610 10.1038/nrg2764 |
ContentType | Journal Article |
Copyright | The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
CorporateAuthor | Genetics and Epidemiology of Colorectal Cancer Consortium |
Discipline | Biology |
EISSN | 1468-4357 |
EndPage | 131 |
ExternalDocumentID | PMC5255050 27474101 10_1093_biostatistics_kxw034 |
Genre | Journal Article |
GrantInformation_xml | NCI NIH HHS: R01 CA189532; P01 CA053996; R01 CA195789 |
ISSN | 1465-4644 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | Score test; Rare genetic variants; Burden and variance component tests; Colorectal cancer; Kernel machine |
Language | English |
License | The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
LinkModel | OpenURL |
OpenAccessLink | https://academic.oup.com/biostatistics/article-pdf/18/1/119/9607760/kxw034.pdf |
PMID | 27474101 |
PageCount | 13 |
PublicationCentury | 2000 |
PublicationDate | 2017-01-01 |
PublicationDateYYYYMMDD | 2017-01-01 |
PublicationDecade | 2010 |
PublicationPlace | England |
PublicationTitle | Biostatistics (Oxford, England) |
PublicationTitleAlternate | Biostatistics |
PublicationYear | 2017 |
Publisher | Oxford University Press |
StartPage | 119 |
SubjectTerms | Gene-Environment Interaction; Humans; Models, Genetic; Models, Statistical; Sequence Analysis, DNA - statistics & numerical data |
Title | A unified powerful set-based test for sequencing data analysis of GxE interactions |
URI | https://www.ncbi.nlm.nih.gov/pubmed/27474101 https://pubmed.ncbi.nlm.nih.gov/PMC5255050 |
Volume | 18 |
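
The abstract in this record describes a weighted burden of variant alleles whose interaction with an environmental factor is tested, and two independent score statistics whose evidence is then combined. The following is a minimal Python sketch of those two ideas under stated assumptions; it is not the authors' implementation. The function names, the toy genotype matrix, and the weights are hypothetical. Fisher's method is used only as a generic stand-in for combining two independent p-values, and the paper's own data-adaptive combination approaches are not reproduced here.

```python
import math


def weighted_burden(genotypes, weights):
    """Per-subject weighted burden: sum over variants j of w_j * G_ij.

    genotypes: list of rows, one per subject, each a list of allele counts.
    weights:   one (hypothetical) functional weight per variant.
    """
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def burden_by_environment(genotypes, weights, exposure):
    """Burden-by-environment interaction covariate: burden_i * E_i."""
    burdens = weighted_burden(genotypes, weights)
    return [b * e for b, e in zip(burdens, exposure)]


def fisher_combine_two(p_fixed, p_variance):
    """Combine two independent p-values with Fisher's method (a stand-in).

    Under the null, X = -2 * (ln p1 + ln p2) is chi-square with 4 degrees
    of freedom, whose survival function is exp(-x/2) * (1 + x/2). This
    simplifies to p1 * p2 * (1 - ln(p1 * p2)).
    """
    if not (0.0 < p_fixed <= 1.0 and 0.0 < p_variance <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    product = p_fixed * p_variance
    return product * (1.0 - math.log(product))


if __name__ == "__main__":
    # Toy data: 3 subjects, 3 rare variants, binary exposure (all made up).
    G = [[0, 1, 0], [1, 1, 0], [0, 0, 2]]
    w = [1.0, 2.0, 0.5]
    E = [1, 0, 1]
    print(weighted_burden(G, w))
    print(burden_by_environment(G, w, E))
    print(fisher_combine_two(0.05, 0.05))
```

In a real analysis the interaction covariate would enter a regression model alongside the main effects, and the two p-values would come from the score tests for the fixed effect and the variance component; both steps are omitted from this sketch.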