Burden Testing of Rare Variants Identified through Exome Sequencing via Publicly Available Control Data

Bibliographic Details
Published in American Journal of Human Genetics, Vol. 103, no. 4, pp. 522-534
Main Authors Guo, Michael H., Plummer, Lacey, Chan, Yee-Ming, Hirschhorn, Joel N., Lippincott, Margaret F.
Format Journal Article
Language English
Published United States Elsevier Inc 04.10.2018
Elsevier
Summary: The genetic causes of many Mendelian disorders remain undefined. Factors such as the lack of large multiplex families, locus heterogeneity, and incomplete penetrance hamper efforts to identify them for many disorders. Previous work suggests that gene-based burden testing, in which the aggregate burden of rare, protein-altering variants in each gene is compared between case and control subjects, might overcome some of these limitations. The increasing availability of large-scale public sequencing databases such as the Genome Aggregation Database (gnomAD) can enable burden testing using these databases as controls, obviating the need for additional control sequencing for each study. However, using public databases as controls poses various challenges, including lack of individual-level data, differences in ancestry, and differences in sequencing platforms and data processing. To illustrate the approach of using public data as controls, we analyzed whole-exome sequencing data from 393 individuals with idiopathic hypogonadotropic hypogonadism (IHH), a rare disorder with significant locus heterogeneity and incomplete penetrance, against control subjects from gnomAD (n = 123,136). We leveraged presumably benign synonymous variants to calibrate our approach. Through iterative analyses, we systematically addressed and overcame various sources of artifact that can arise when using public control data. In particular, we introduce an approach for highly adaptable variant quality filtering that leads to well-calibrated results. Our approach "re-discovered" genes previously implicated in IHH (FGFR1, TACR3, GNRHR). Furthermore, we identified a significant burden in TYRO3, a gene implicated in hypogonadotropic hypogonadism in mice. Finally, we developed a user-friendly software package, TRAPD (Test Rare vAriants with Public Data), for performing gene-based burden testing against public databases.
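The per-gene comparison described in the summary can be illustrated with a minimal sketch. It assumes, for illustration only, that each gene has been reduced to a count of individuals carrying at least one qualifying rare variant, and it compares cases with controls using a one-sided Fisher's exact test. The function name, the gene labels, and all counts are hypothetical; this is not the TRAPD implementation, and the summary does not state which statistical test the authors used.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact p-value for enrichment of carriers among cases.

    Returns the hypergeometric probability of observing at least
    `case_carriers` carriers among the cases when carrier status is
    independent of case/control status.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    max_case_carriers = min(carriers, case_total)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, max_case_carriers + 1)
    )
    return numerator / denominator


# Cohort sizes taken from the summary; per-gene counts are made up.
N_CASES = 393
N_CONTROLS = 123_136

# Hypothetical (case carriers, control carriers) per gene.
hypothetical_counts = {
    "GENE_A": (10, 200),  # made-up example of an apparent excess in cases
    "GENE_B": (1, 313),   # made-up example with similar carrier rates
}

for gene, (case_n, control_n) in hypothetical_counts.items():
    p_value = fisher_exact_greater(case_n, N_CASES, control_n, N_CONTROLS)
    print(f"{gene}: p = {p_value:.3g}")
```

Because public databases provide summary-level rather than individual-level data, the control carrier counts in a real analysis would have to be approximated from allele counts, which is one of the challenges the summary names.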
These authors contributed equally to this work
Present address: Department of Medicine, University of North Carolina Hospitals, Chapel Hill, NC 25779, USA
ISSN:0002-9297
1537-6605
DOI:10.1016/j.ajhg.2018.08.016