Fast and Flexible Protein Design Using Deep Graph Neural Networks

Bibliographic Details
Published in: Cell Systems, Vol. 11, no. 4, pp. 402-411.e4
Main Authors: Strokach, Alexey; Becerra, David; Corbi-Verge, Carles; Perez-Riba, Albert; Kim, Philip M.
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 21.10.2020
Summary: Protein structure and function is determined by the arrangement of the linear sequence of amino acids in 3D space. We show that a deep graph neural network, ProteinSolver, can precisely design sequences that fold into a predetermined shape by phrasing this challenge as a constraint satisfaction problem (CSP), akin to Sudoku puzzles. We trained ProteinSolver on over 70,000,000 real protein sequences corresponding to over 80,000 structures. We show that our method rapidly designs new protein sequences and benchmark them in silico using energy-based scores, molecular dynamics, and structure prediction methods. As a proof-of-principle validation, we use ProteinSolver to generate sequences that match the structure of serum albumin, then synthesize the top-scoring design and validate it in vitro using circular dichroism. ProteinSolver is freely available at http://design.proteinsolver.org and https://gitlab.com/ostrokach/proteinsolver. A record of this paper’s transparent peer review process is included in the Supplemental Information.

• Graph neural network generates new proteins with predetermined topologies
• Probabilities assigned to individual amino acids correlate with stability of mutants
• Probabilities assigned to amino acid sequences correlate with stability of designs
• Orders of magnitude faster than traditional approaches

Strokach et al. developed ProteinSolver, a graph convolutional neural network trained on the PDB and sequences in UniParc to reconstruct amino acid sequences that adhere to constraints imposed by protein topologies. It can generate new sequences that fold into predetermined shapes and predict effects of mutations on stability.
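The summary frames design as filling in unknown residues under constraints, and notes that the probabilities assigned to a sequence correlate with stability. The sketch below illustrates both ideas in plain Python. It is not ProteinSolver's API: `toy_predict` is a hypothetical stand-in for a trained network, and filling the most confident open position first is one plausible decoding strategy, not necessarily the one used in the paper.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_log_prob(sequence, probs):
    """Sum the log-probabilities of each residue in `sequence`.

    probs[i] maps an amino-acid letter to its probability at position i.
    """
    if len(sequence) != len(probs):
        raise ValueError("sequence and probs must have the same length")
    return sum(math.log(probs[i][aa]) for i, aa in enumerate(sequence))


def greedy_fill(partial, predict):
    """Fill '-' positions one at a time, most confident position first.

    `predict` is any callable that takes the current sequence and returns
    one probability dict per position (an assumed interface).
    """
    seq = list(partial)
    while "-" in seq:
        probs = predict("".join(seq))
        best_pos, best_aa, best_p = None, None, -1.0
        for i, ch in enumerate(seq):
            if ch != "-":
                continue  # position already assigned
            aa, p = max(probs[i].items(), key=lambda kv: kv[1])
            if p > best_p:
                best_pos, best_aa, best_p = i, aa, p
        seq[best_pos] = best_aa
    return "".join(seq)


def toy_predict(seq):
    """Hypothetical model: strongly prefers the repeating pattern M, K, V."""
    target = "MKV"
    out = []
    for i in range(len(seq)):
        dist = {aa: 0.01 for aa in AMINO_ACIDS}
        dist[target[i % len(target)]] = 0.81  # 19 * 0.01 + 0.81 = 1.0
        out.append(dist)
    return out


designed = greedy_fill("M--", toy_predict)
score = sequence_log_prob(designed, toy_predict(designed))
```

In a real setting, `predict` would be conditioned on the target structure, and `score` could be used to rank candidate designs against each other.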
ISSN: 2405-4712, 2405-4720
DOI: 10.1016/j.cels.2020.08.016