An association test of the spatial distribution of rare missense variants within protein structures identifies Alzheimer's disease–related patterns

Bibliographic Details
Published in Genome Research, Vol. 32, no. 4, pp. 778-790
Main Authors Jin, Bowen; Capra, John A.; Benchek, Penelope; Wheeler, Nicholas; Naj, Adam C.; Hamilton-Nelson, Kara L.; Farrell, John J.; Leung, Yuk Yee; Kunkle, Brian; Vadarajan, Badri; Schellenberg, Gerard D.; Mayeux, Richard; Wang, Li-San; Farrer, Lindsay A.; Pericak-Vance, Margaret A.; Martin, Eden R.; Haines, Jonathan L.; Crawford, Dana C.; Bush, William S.
Format Journal Article
Language English
Published United States: Cold Spring Harbor Laboratory Press, 01.04.2022
Summary: More than 90% of genetic variants are rare in most modern sequencing studies, such as the Alzheimer's Disease Sequencing Project (ADSP) whole-exome sequencing (WES) data. Furthermore, 54% of the rare variants in ADSP WES are singletons. However, both single variant and unit-based tests are limited in their statistical power to detect an association between rare variants and phenotypes. To best use missense rare variants and investigate their biological effect, we examine their association with phenotypes in the context of protein structures. We developed a protein structure–based approach, protein optimized kernel evaluation of missense nucleotides (POKEMON), which evaluates rare missense variants based on their spatial distribution within a protein rather than their allele frequency. The hypothesis behind this test is that the three-dimensional spatial distribution of variants within a protein structure provides functional context to power an association test. POKEMON identified three candidate genes (TREM2, SORL1, and EXOC3L4) and another suggestive gene from the ADSP WES data. For TREM2 and SORL1, two known Alzheimer's disease (AD) genes, the signal from the spatial cluster is stable even if we exclude known AD risk variants, indicating the presence of additional low-frequency risk variants within these genes. EXOC3L4 is a novel AD risk gene that has a cluster of variants primarily shared by case subjects around the Sec6 domain. This cluster is also validated in an independent replication data set and a validation data set with a larger sample size.
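
The idea in the summary, scoring subjects by where their rare missense variants sit in the folded protein rather than by allele frequency, can be illustrated with a generic kernel association sketch. This is not the published POKEMON implementation: the Gaussian distance decay, the `sigma` bandwidth, the score statistic, the permutation p-value, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def spatial_kernel(genotypes, coords, sigma=10.0):
    """Subject-by-subject similarity from the 3D proximity of carried variants.

    genotypes: (n_subjects, n_variants) array of 0/1 carrier indicators.
    coords:    (n_variants, 3) residue coordinates in angstroms.
    sigma:     hypothetical bandwidth of the Gaussian distance decay.
    """
    # Pairwise Euclidean distances between variant positions in the structure.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Variants close together in space get similarity near 1, distant ones near 0.
    variant_sim = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    # Two subjects are similar when they carry variants that are spatially close.
    return genotypes @ variant_sim @ genotypes.T


def kernel_score(phenotype, kernel):
    """Variance-component style score: large when similar subjects share status."""
    resid = phenotype - phenotype.mean()
    return float(resid @ kernel @ resid)


def permutation_pvalue(phenotype, kernel, n_perm=1000, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = np.random.default_rng(seed)
    observed = kernel_score(phenotype, kernel)
    hits = sum(
        kernel_score(rng.permutation(phenotype), kernel) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)
```

In this sketch allele frequency never enters the kernel, so singletons contribute as long as they fall near other variants in the structure, which is the property the summary emphasizes. A real analysis would also adjust for covariates and would use an analytic null distribution rather than permutation.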
ISSN: 1088-9051, 1549-5469
DOI: 10.1101/gr.276069.121