SARS-CoV2 billion-compound docking

Bibliographic Details
Published in Scientific Data Vol. 10; no. 1; p. 173
Main Authors Rogers, David M., Agarwal, Rupesh, Vermaas, Josh V., Smith, Micholas Dean, Rajeshwar, Rajitha T., Cooper, Connor, Sedova, Ada, Boehm, Swen, Baker, Matthew, Glaser, Jens, Smith, Jeremy C.
Format Journal Article
Language English
Published London Nature Publishing Group UK 28.03.2023
Nature Publishing Group
Nature Portfolio
Summary: This dataset contains ligand conformations and docking scores for 1.4 billion molecules docked against 6 structural targets from SARS-CoV2, representing 5 unique proteins: MPro, NSP15, PLPro, RDRP, and the Spike protein. Docking was carried out using the AutoDock-GPU platform on the Summit supercomputer and Google Cloud. The docking procedure employed the Solis Wets search method to generate 20 independent ligand binding poses per compound. Each compound geometry was scored using the AutoDock free energy estimate, and rescored using RFScore v3 and DUD-E machine-learned rescoring models. Input protein structures are included, suitable for use by AutoDock-GPU and other docking programs. As the result of an exceptionally large docking campaign, this dataset represents a valuable resource for discovering trends across small molecule and protein binding sites, training AI models, and comparing to inhibitor compounds targeting SARS-CoV-2. The work also gives an example of how to organize and process data from ultra-large docking screens.
Measurement(s): equilibrium association constant (KA)
Technology Type(s): molecular docking by scoring function
Factor Type(s): Chemical formula and connectivity
Sample Characteristic - Organism: Severe acute respiratory syndrome-related coronavirus
Sample Characteristic - Environment: in-silico
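The summary lists the equilibrium association constant (KA) as the measured quantity and a docking free energy estimate as the score. A minimal sketch of how the two are commonly related, K_A = exp(-ΔG / RT), follows. It assumes scores in kcal/mol (the unit AutoDock reports), a temperature of 298.15 K, and a 1 M standard state. The function names and the example scores are hypothetical and are not taken from the dataset.

```python
import math

# Gas constant in kcal/(mol*K); docking scores are assumed to be in kcal/mol.
R_KCAL = 1.987204e-3


def association_constant(delta_g_kcal, temperature_k=298.15):
    """Return K_A (in 1/M, 1 M standard state) for a binding free energy.

    Uses K_A = exp(-dG / (R * T)). The default temperature is an assumption.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


def best_pose_score(pose_scores):
    """Return the most favorable (lowest) score among a compound's poses."""
    return min(pose_scores)


# Hypothetical scores for a few poses of one compound (not real data).
scores = [-7.2, -6.8, -8.1, -5.9]
best = best_pose_score(scores)
ka = association_constant(best)
print(f"best score: {best} kcal/mol, estimated K_A: {ka:.3g} 1/M")
```

Taking the minimum over poses is one simple way to summarize the 20 poses per compound; other summaries, such as averaging or clustering poses, are equally possible.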
USDOE
ISSN: 2052-4463
DOI: 10.1038/s41597-023-01984-9