Are 2D fingerprints still valuable for drug discovery?

Bibliographic Details
Published in Physical chemistry chemical physics : PCCP Vol. 22; no. 16; pp. 8373 - 839
Main Authors Gao, Kaifu, Nguyen, Duc Duy, Sresht, Vishnu, Mathiowetz, Alan M, Tu, Meihua, Wei, Guo-Wei
Format Journal Article
Language English
Published England Royal Society of Chemistry 29.04.2020
Summary: Recently, molecular fingerprints extracted from three-dimensional (3D) structures using advanced mathematics, such as algebraic topology, differential geometry, and graph theory, have been paired with efficient machine learning, especially deep learning algorithms, to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets associated with four typical problems, namely protein-ligand binding, toxicity, solubility, and partition coefficient, to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms, including random forest, gradient boosted decision tree, single-task deep neural network, and multitask deep neural network, are employed to construct efficient 2D-fingerprint-based models. Additionally, appropriate consensus models are built to further enhance the performance of 2D-fingerprint-based methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, solubility, partition coefficient, and protein-ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D-fingerprint-based methods in complex-based protein-ligand binding affinity predictions. Recently, low-dimensional mathematical representations have overshadowed other methods in drug discovery. This work reassesses eight 2D fingerprints on 23 molecular datasets and reveals that they can perform as well as mathematical representations in tasks involving only small molecules.
Bibliography: Electronic supplementary information (ESI) available. See DOI: 10.1039/d0cp00305k
ISSN: 1463-9076, 1463-9084
DOI: 10.1039/d0cp00305k