Long Time Scale Ensemble Methods in Molecular Dynamics: Ligand–Protein Interactions and Allostery in SARS-CoV‑2 Targets

Bibliographic Details
Published in Journal of Chemical Theory and Computation, Vol. 19, no. 11, pp. 3359-3378
Main Authors Bhati, Agastya P.; Hoti, Art; Potterton, Andrew; Bieniek, Mateusz K.; Coveney, Peter V.
Format Journal Article
Language English
Published United States: American Chemical Society, 13.06.2023
Summary: We subject a series of five protein–ligand systems that contain important SARS-CoV-2 targets, 3-chymotrypsin-like protease (3CLPro), papain-like protease, and adenosine ribose phosphatase, to long time scale and adaptive sampling molecular dynamics simulations. By performing ensembles of ten or twelve 10 μs simulations for each system, we accurately and reproducibly determine ligand binding sites, both crystallographically resolved and otherwise, thereby discovering binding sites that can be exploited for drug discovery. We also report robust, ensemble-based observation of conformational changes that occur at the main binding site of 3CLPro due to the presence of another ligand at an allosteric binding site, explaining the underlying cascade of events responsible for its inhibitory effect. Using our simulations, we have discovered a novel allosteric mechanism of inhibition for a ligand known to bind only at the substrate binding site. Due to the chaotic nature of molecular dynamics trajectories, individual trajectories, regardless of their temporal duration, do not allow for accurate or reproducible elucidation of macroscopic expectation values. Unprecedentedly at this time scale, we compare the statistical distribution of protein–ligand contact frequencies for these ten or twelve 10 μs trajectories and find that over 90% of trajectories have significantly different contact frequency distributions. Furthermore, using a direct binding free energy calculation protocol, we determine the ligand binding free energies for each of the identified sites using long time scale simulations. The free energies differ by 0.77 to 7.26 kcal/mol across individual trajectories, depending on the binding site and the system. We show that, although this is the standard way such quantities are currently reported at long time scale, individual simulations do not yield reliable free energies. Ensembles of independent trajectories are necessary to overcome the aleatoric uncertainty in order to obtain statistically meaningful and reproducible results. Finally, we compare the application of different free energy methods to these systems and discuss their advantages and disadvantages. Our findings here are generally applicable to all molecular dynamics based applications and not confined to the free energy methods used in this study.
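The summary's central point, that a single trajectory gives one draw from a distribution while an ensemble gives a mean with a quantifiable uncertainty, can be illustrated with a minimal sketch. This is not the authors' protocol: the function name `ensemble_estimate`, the bootstrap resampling scheme, and the per-replica free energies below are all assumptions made up for illustration.

```python
import random
import statistics


def ensemble_estimate(replica_values, n_boot=2000, seed=0):
    """Return (ensemble mean, bootstrap standard error) over independent replicas.

    Hypothetical helper: each entry of `replica_values` is assumed to be a
    binding free energy (kcal/mol) obtained from one independent trajectory.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(replica_values)
    mean = statistics.fmean(replica_values)

    # Resample the replicas with replacement and record each resample's mean.
    boot_means = []
    for _ in range(n_boot):
        resample = [replica_values[rng.randrange(n)] for _ in range(n)]
        boot_means.append(statistics.fmean(resample))

    # The spread of the resampled means estimates the error of the ensemble mean.
    return mean, statistics.stdev(boot_means)


# Made-up per-replica free energies for a ten-member ensemble (kcal/mol).
dg = [-6.1, -4.8, -7.3, -5.5, -6.9, -3.9, -5.2, -6.4, -7.0, -5.8]

mean, err = ensemble_estimate(dg)
spread = max(dg) - min(dg)  # how far apart two single-trajectory answers can be

print(f"ensemble mean   = {mean:.2f} kcal/mol")
print(f"bootstrap error = {err:.2f} kcal/mol")
print(f"replica spread  = {spread:.2f} kcal/mol")
```

In this toy data set, reporting any one replica could place the answer anywhere within the replica spread, whereas the ensemble mean comes with an error bar that shrinks as replicas are added.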
Funding: USDOE; UK EPSRC; European Union’s Horizon 2020 Research and Innovation Programme; Software Environment for Actionable & VVUQ-evaluated Exascale Applications (SEAVEA)
Grant numbers: AC05-00OR22725; EP/R029598/1; EP/W007762/1; 823712; COMPBIO; COMPBIO2
ISSN: 1549-9618; 1549-9626
DOI: 10.1021/acs.jctc.3c00020