Descriptor Selection Improvements for Quantitative Structure-Activity Relationships

Bibliographic Details
Published in International journal of neural systems Vol. 29; no. 9; p. 1950016
Main Authors Xia, Liang-Yong, Wang, Qing-Yong, Cao, Zehong, Liang, Yong
Format Journal Article
Language English
Published Singapore 01.11.2019
Summary: Molecular descriptor selection is an essential step in improving a predictive quantitative structure-activity relationship (QSAR) model. However, a QSAR model often includes a number of redundant, noisy, and irrelevant descriptors. In this study, we propose a novel descriptor selection framework using self-paced learning (SPL) via sparse logistic regression (LR) with a Logsum penalty (SPL-Logsum), which adaptively distinguishes simple from complex samples while avoiding over-fitting. SPL is inspired by the way humans and animals learn, progressing gradually from simple to complex samples when training a model, and the Logsum-penalized LR selects a small subset of significant molecular descriptors to improve the QSAR model. Experimental results on simulations and three public QSAR datasets show that the proposed SPL-Logsum framework outperforms other existing sparse methods in terms of area under the curve, sensitivity, specificity, accuracy, and P-values.
ISSN:1793-6462
DOI:10.1142/S0129065719500163