Predicting the emission wavelength of organic molecules using a combinatorial QSAR and machine learning approach


Bibliographic Details
Published in RSC Advances Vol. 10; no. 40; pp. 23834-23841
Main Authors Ye, Zong-Rong; Huang, I.-Shou; Chan, Yu-Te; Li, Zhong-Ji; Liao, Chen-Cheng; Tsai, Hao-Rong; Hsieh, Meng-Chi; Chang, Chun-Chih; Tsai, Ming-Kang
Format Journal Article
Language English
Published England Royal Society of Chemistry 23.06.2020
The Royal Society of Chemistry
Summary: Organic fluorescent molecules play critical roles in fluorescence inspection, biological probes, and labeling indicators. More than ten thousand organic fluorescent molecules were imported into this study, and a machine-learning-based approach was then used to extract the intrinsic structural characteristics that correlate with fluorescence emission. A systematic informatics procedure was introduced, proceeding through descriptor cleaning, descriptor space reduction, and statistically meaningful regression, to build a broad and valid model for estimating the fluorescence emission wavelength. Least absolute shrinkage and selection operator (Lasso) regression coupled with a random forest model was finally reported as the numerical predictor that also satisfied the statistical criteria. This informatics model appeared to offer comparable predictive ability and to be complementary to the conventional time-dependent density functional theory method in emission wavelength prediction, but at a fraction of the computational expense. The combinatorial QSAR and machine learning approach provides a qualitative and computationally efficient prediction of the fluorescence emission wavelength of organic molecules.
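The first two stages named in the summary (descriptor cleaning, then Lasso-based descriptor space reduction) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The data are synthetic, the descriptor count, the penalty `alpha`, the "wavelength" coefficients, and all function names are assumptions made for illustration, and the random forest stage is left out. In practice a library implementation of Lasso and random forest would normally replace the hand-written solver.

```python
import random
import statistics


def drop_constant_columns(X, tol=1e-12):
    """Descriptor cleaning: keep indices of columns whose values actually vary."""
    return [j for j in range(len(X[0]))
            if statistics.pstdev([row[j] for row in X]) > tol]


def standardize(column):
    """Scale one descriptor column to zero mean and unit (population) variance."""
    mu = statistics.mean(column)
    sd = statistics.pstdev(column)
    return [(v - mu) / sd for v in column]


def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; this is what gives Lasso exact zeros."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_coordinate_descent(columns, y, alpha, n_sweeps=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 for standardized columns."""
    n = len(y)
    y_mean = statistics.mean(y)
    residual = [yi - y_mean for yi in y]  # all weights start at zero
    weights = [0.0] * len(columns)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual, plus its own current weight
            # (valid because each standardized column has mean square equal to 1).
            rho = sum(c * r for c, r in zip(col, residual)) / n + weights[j]
            new_w = soft_threshold(rho, alpha)
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights


# Synthetic stand-in data (hypothetical): 200 "molecules", 7 "descriptors".
# Descriptors 0 and 1 drive the target, 2-5 are noise, 6 is constant.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    row = [rng.gauss(0, 1) for _ in range(6)] + [1.0]
    X.append(row)
    y.append(500.0 + 40.0 * row[0] - 25.0 * row[1] + rng.gauss(0, 1))

kept = drop_constant_columns(X)
columns = [standardize([row[j] for row in X]) for j in kept]
weights = lasso_coordinate_descent(columns, y, alpha=5.0)
selected = [j for j, w in zip(kept, weights) if w != 0.0]
print("descriptors kept after cleaning:", kept)
print("descriptors with non-zero Lasso weight:", selected)
```

In a two-stage setup of the kind the summary describes, the descriptors in `selected` would then be passed to a random forest regressor. Larger values of `alpha` prune more descriptors at the cost of more shrinkage in the retained weights.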
Bibliography: Electronic supplementary information (ESI) available: All of the descriptor categories from PaDEL, the selected compounds for DFT vs. Lasso-RF comparison, the schematic representation of Lasso-RF model. See DOI: 10.1039/d0ra05014h
ISSN: 2046-2069
DOI: 10.1039/d0ra05014h