A quantum chemical interaction energy dataset for accurately modeling protein-ligand interactions
Published in | Scientific Data, Vol. 10, no. 1, article 619 (14 pages) |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | London: Nature Publishing Group UK, 12.09.2023 (Nature Publishing Group; Nature Portfolio) |
Summary: | Fast and accurate calculation of intermolecular interaction energies is desirable for understanding many chemical and biological processes, including the binding of small molecules to proteins. The Splinter [“Symmetry-adapted perturbation theory (SAPT0) protein-ligand interaction”] dataset has been created to facilitate the development and improvement of methods for performing such calculations. Molecular fragments representing commonly found substructures in proteins and small-molecule ligands were paired into >9000 unique dimers, assembled into numerous configurations using an approach designed to adequately cover the breadth of the dimers’ potential energy surfaces while enhancing sampling in favorable regions. ~1.5 million configurations of these dimers were randomly generated, and a structurally diverse subset of these were minimized to obtain an additional ~80 thousand local and global minima. For all >1.6 million configurations, SAPT0 calculations were performed with two basis sets to complete the dataset. It is expected that Splinter will be a useful benchmark dataset for training and testing various methods for the calculation of intermolecular interaction energies. |
---|---|
ISSN: | 2052-4463 |
DOI: | 10.1038/s41597-023-02443-1 |