New machine learning and physics-based scoring functions for drug discovery

Bibliographic Details
Published in Scientific Reports Vol. 11; no. 1; pp. 3198 - 19
Main Authors Guedes, Isabella A.; Barreto, André M. S.; Marinho, Diogo; Krempser, Eduardo; Kuenemann, Mélaine A.; Sperandio, Olivier; Dardenne, Laurent E.; Miteva, Maria A.
Format Journal Article
Language English
Published London Nature Publishing Group UK 04.02.2021
Nature Publishing Group
Nature Portfolio
Summary: Scoring functions are essential for modern in silico drug discovery. However, accurately predicting binding affinity with scoring functions remains a challenging task, and their performance varies widely across target classes. Scoring functions based on precise physics-based descriptors that better represent the protein–ligand recognition process are strongly needed. We developed a set of new empirical scoring functions, named DockTScore, by explicitly accounting for physics-based terms combined with machine learning. Target-specific scoring functions were developed for two important drug targets, proteases and protein–protein interactions, representing an original class of molecules for drug discovery. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interaction terms, and an improved term accounting for the ligand torsional entropy contribution to ligand binding. The DockTScore scoring functions proved competitive with the current best-evaluated scoring functions in terms of binding energy prediction and ranking on four DUD-E datasets, and they will be useful for in silico drug design for diverse proteins as well as for specific targets such as proteases and protein–protein interactions. Currently, the MLR DockTScore is available at www.dockthor.lncc.br.
ISSN: 2045-2322
DOI: 10.1038/s41598-021-82410-1
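
Illustrative note: the summary describes deriving an empirical scoring function by multiple linear regression over physics-based terms. The sketch below shows only the general idea on synthetic data. The term names, the weights, and the helper functions `fit_mlr` and `score` are hypothetical and chosen for illustration; they are not the DockTScore descriptors, coefficients, or code.

```python
import numpy as np

# Hypothetical descriptor names, loosely echoing the term families named in
# the summary (force-field, solvation, lipophilic, torsional entropy).
TERMS = ["vdw", "electrostatic", "solvation", "lipophilic", "torsional_entropy"]


def fit_mlr(X, y):
    """Fit y ~ X @ w + b by ordinary least squares and return (w, b)."""
    # Append a column of ones so the intercept is estimated with the weights.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]


def score(X, w, b):
    """Predicted affinity as a weighted sum of descriptor terms plus intercept."""
    return X @ w + b


# Synthetic training set: 200 made-up complexes, one row of descriptors each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, len(TERMS)))
true_w = np.array([-0.8, -0.3, 0.5, -0.6, 0.4])  # arbitrary illustrative weights
y = X @ true_w - 6.0 + rng.normal(scale=0.1, size=200)  # synthetic "affinities"

w, b = fit_mlr(X, y)
pred = score(X, w, b)
r = np.corrcoef(pred, y)[0, 1]  # Pearson correlation between predicted and target

for name, weight in zip(TERMS, w):
    print(f"{name:>18s}: {weight:+.3f}")
print(f"intercept: {b:+.3f}, Pearson r: {r:.3f}")
```

A linear model keeps each term's contribution directly readable from its weight. The support vector machine and random forest variants mentioned in the summary would replace `fit_mlr` with a nonlinear regressor trained on the same descriptor matrix.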