Structural pre-training improves physical accuracy of antibody structure prediction using deep learning
Published in | ImmunoInformatics Vol. 11; p. 100028 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.09.2023 |
Summary: | The protein folding problem recently obtained a practical solution, owing to advances in deep learning. There are, however, classes of proteins, such as antibodies, that are structurally unique and for which the general solution still falls short. In particular, prediction of the CDR-H3 loop, which is instrumental to an antibody's antigen recognition, remains a challenge. Antibody-specific deep learning frameworks have been proposed to tackle this problem and have made great progress in both accuracy and speed. Often, though, the original networks produce physically implausible bond geometries that must then undergo a time-consuming energy minimization process. Here we hypothesized that pre-training the network on a large, augmented set of models with correct physical geometries, rather than on a small set of real antibody X-ray structures, would allow the network to learn better bond geometries. We show that fine-tuning such a pre-trained network on the task of shape prediction on real X-ray structures increases the number of correct peptide bond distances, abstracted as the Cα distances. We further demonstrate that pre-training allows the network to produce physically plausible shapes on an artificial set of CDR-H3s, showing the ability to generalize to the vast antibody sequence space. We hope that our strategy will benefit the development of deep learning antibody models that rapidly generate physically plausible geometries, without the burden of time-consuming energy minimization. |
---|---|
ISSN: | 2667-1190 |
DOI: | 10.1016/j.immuno.2023.100028 |
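
The summary measures physical plausibility through peptide bond distances, abstracted as Cα distances. As a rough illustration of what such a check could look like, the sketch below counts how many consecutive Cα-Cα distances in a loop fall near the value typical of a trans peptide bond (about 3.8 Å). The tolerance, the function names, and the toy coordinates are all assumptions made for this example; they are not taken from the article, whose exact criterion is not stated in this record.

```python
import math

# Assumed values, not taken from the article: ~3.8 Å is the typical distance
# between consecutive Cα atoms joined by a trans peptide bond.
IDEAL_CA_CA = 3.8  # Å
TOLERANCE = 0.2    # Å, hypothetical cutoff chosen for illustration


def ca_distances(ca_coords):
    """Distances between consecutive Cα atoms, given (x, y, z) tuples in Å."""
    return [math.dist(a, b) for a, b in zip(ca_coords, ca_coords[1:])]


def fraction_plausible(ca_coords, ideal=IDEAL_CA_CA, tol=TOLERANCE):
    """Fraction of consecutive Cα-Cα distances within `tol` of `ideal`."""
    dists = ca_distances(ca_coords)
    if not dists:
        return 0.0
    ok = sum(1 for d in dists if abs(d - ideal) <= tol)
    return ok / len(dists)


# Toy loop (hypothetical coordinates): three plausible spacings
# followed by one stretched 5.2 Å gap.
loop = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (3.8, 3.8, 3.8),
    (3.8, 3.8, 9.0),
]
print(fraction_plausible(loop))  # 0.75
```

A score like this could be computed on raw network output before any energy minimization, which is the setting the summary describes; a real evaluation would read Cα coordinates from predicted structure files instead of a hand-written list.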