Highly accurate protein structure prediction for the human proteome
Published in | Nature (London), Vol. 596, no. 7873, pp. 590-596 |
---|---|
Format | Journal Article |
Language | English |
Published | London; Nature Publishing Group UK; 26.08.2021; Nature Publishing Group |
Summary: | Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure [1]. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold [2], at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective. AlphaFold is used to predict the structures of almost all of the proteins in the human proteome. The availability of high-confidence predicted structures could enable new avenues of investigation from a structural perspective. |
---|---|
ISSN: | 0028-0836, 1476-4687 |
DOI: | 10.1038/s41586-021-03828-1 |