Optimizing and benchmarking polygenic risk scores with GWAS summary statistics

Bibliographic Details
Published in Genome Biology Vol. 25; no. 1; pp. 260 - 28
Main Authors Zhao, Zijie, Gruenloh, Tim, Yan, Meiyi, Wu, Yixuan, Sun, Zhongxuan, Miao, Jiacheng, Wu, Yuchang, Song, Jie, Lu, Qiongshi
Format Journal Article
Language English
Published England BioMed Central 08.10.2024
BMC
Summary: Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks including model fine-tuning, benchmarking, and ensemble learning. We introduce an innovative statistical framework to optimize and benchmark PRS models using summary statistics of genome-wide association studies. This framework builds upon our previous work and can fine-tune virtually all existing PRS models while accounting for linkage disequilibrium. In addition, we provide an ensemble learning strategy named PUMAS-ensemble to combine multiple PRS models into an ensemble score without requiring external data for model fitting. Through extensive simulations and analysis of many complex traits in the UK Biobank, we demonstrate that this approach closely approximates gold-standard analytical strategies based on external validation, and substantially outperforms state-of-the-art PRS methods. Our method is a powerful and general modeling technique that can continue to combine the best-performing PRS methods out there through ensemble learning and could become an integral component for all future PRS applications.
ISSN: 1474-760X
1474-7596
DOI: 10.1186/s13059-024-03400-w