Algebraic graph-assisted bidirectional transformers for molecular property prediction

The ability of molecular property prediction is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional...

Full description

Saved in:
Bibliographic Details
Published inNature communications Vol. 12; no. 1; pp. 3521 - 9
Main Authors Chen, Dong, Gao, Kaifu, Nguyen, Duc Duy, Chen, Xin, Jiang, Yi, Wei, Guo-Wei, Pan, Feng
Format Journal Article
LanguageEnglish
Published London Nature Publishing Group UK 10.06.2021
Nature Publishing Group
Nature Portfolio
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:The ability of molecular property prediction is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional encoder from transformer, can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy, it neglects three-dimensional (3D) stereochemical information. Algebraic graph, specifically, element-specific multiscale weighted colored algebraic graph, embeds complementary 3D molecular information into graph invariants. We propose an algebraic graph-assisted bidirectional transformer (AGBT) framework by fusing representations generated by algebraic graph and bidirectional transformer, as well as a variety of machine learning algorithms, including decision trees, multitask learning, and deep neural networks. We validate the proposed AGBT framework on eight molecular datasets, involving quantitative toxicity, physical chemistry, and physiology datasets. Extensive numerical experiments have shown that AGBT is a state-of-the-art framework for molecular property prediction. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Here, the authors propose an algebraic graph-assisted bidirectional transformer, which can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy and assisted with 3D stereochemical information from graphs.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:2041-1723
2041-1723
DOI:10.1038/s41467-021-23720-w