Integration of element specific persistent homology and machine learning for protein‐ligand binding affinity prediction

Bibliographic Details
Published in International journal for numerical methods in biomedical engineering Vol. 34; no. 2
Main Authors Cang, Zixuan; Wei, Guo‐Wei
Format Journal Article
Language English
Published England Wiley Subscription Services, Inc 01.02.2018
Summary: Protein‐ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein‐ligand binding affinities is vital to rational drug design and to the understanding of protein‐ligand binding and binding‐induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH), or multicomponent persistent homology, to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology‐based machine‐learning paradigm outperforms other existing methods in protein‐ligand binding affinity predictions. ESPH reveals protein‐ligand binding mechanisms that cannot be attained from other conventional techniques. The present approach reveals that protein‐ligand hydrophobic interactions extend to 40 Å away from the binding site, which has significant ramifications for drug and protein design.
Most physical models are based on geometry, which leads to a high number of degrees of freedom for biomolecular data sets. In contrast, topology is often too abstract to be practically useful. Persistent homology bridges the gap between geometry and topology but neglects biological information. This work introduces element‐specific persistent homology to retain essential biological information during topological simplification. The integration of element‐specific persistent homology and machine learning sheds light on the molecular mechanism of protein‐ligand binding that cannot be obtained from other conventional techniques.
ISSN: 2040-7939, 2040-7947
DOI: 10.1002/cnm.2914