Deep generative modeling of the human proteome reveals over a hundred novel genes involved in rare genetic disorders

Bibliographic Details
Published in medRxiv : the preprint server for health sciences
Main Authors Orenbuch, Rose; Kollasch, Aaron W; Spinner, Hansen D; Shearer, Courtney A; Hopf, Thomas A; Franceschi, Dinko; Dias, Mafalda; Frazer, Jonathan; Marks, Debora S
Format Journal Article
Language English
Published United States 28.11.2023
Summary: Identifying causal mutations accelerates genetic disease diagnosis and therapeutic development. Missense variants present a bottleneck in genetic diagnosis because their effects are less straightforward to interpret than those of truncating or nonsense mutations. While computational prediction methods are increasingly successful at scoring variants in known disease genes, they do not generalize well to other genes because their scores are not calibrated across the proteome. To address this, we developed a deep generative model, popEVE, that combines evolutionary information with population sequence data and achieves state-of-the-art performance at ranking variants by severity to distinguish patients with severe developmental disorders from potentially healthy individuals. popEVE identifies 442 genes in a cohort of developmental disorder cases, including evidence of 119 novel genetic disorders, without the need for gene-level enrichment and without overestimating the prevalence of pathogenic variants in the population. By placing variants on a unified scale, our model offers a comprehensive perspective on the distribution of fitness effects across the entire proteome and the broader human population. popEVE provides compelling evidence for genetic diagnoses even in exceptionally rare single-patient disorders, where conventional techniques that rely on repeated observations may not be applicable. An interactive web viewer and downloads are available at pop.evemodel.org.