The role of different sampling methods in improving biological activity prediction using deep belief network

Bibliographic Details
Published in Journal of Computational Chemistry Vol. 38; no. 4; pp. 195-203
Main Authors Ghasemi, Fahimeh; Fassihi, Afshin; Pérez-Sánchez, Horacio; Mehri Dehnavi, Alireza
Format Journal Article
Language English
Published United States: Wiley Subscription Services, Inc., 05.02.2017
Summary: Technological advances in different branches of chemistry have made thousands of molecules and descriptors available to medicinal chemists. This abundance, together with the correlation among them, has raised new problems in quantitative structure-activity relationship (QSAR) studies. Proper parameter initialization in statistical modeling has emerged as another challenge in recent years: random selection of parameters leads to poor performance of deep neural networks (DNNs). In this research, a deep belief network (DBN) was applied to initialize DNNs. A DBN is composed of stacked restricted Boltzmann machines, an energy-based model whose training requires computing the log-likelihood gradient over all samples. Three different sampling approaches were suggested to estimate this gradient. DBNs trained with each of these sampling approaches were then used to initialize the DNN architecture for predicting the biological activity of all fifteen Kaggle targets, which contain more than 70k molecules. As in other fields of processing research, the outputs of these models demonstrated significant superiority over those of DNNs with random parameters. © 2016 Wiley Periodicals, Inc. High-throughput virtual screening is a computational approach in drug discovery that faces many fundamental problems, such as proneness to over-fitting. A novel solution for avoiding them is pre-training the network parameters with a deep belief network (DBN). The main problem in applying a DBN is calculating the log-likelihood gradient, for which different sampling approaches have been suggested. The results of this study demonstrated that a DBN could improve the ability of a DNN to provide high-quality predictive models.
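Illustrative note: the summary does not name the three sampling approaches, so the sketch below is not the authors' method. It shows one widely used sampling-based estimate of the restricted Boltzmann machine log-likelihood gradient, single-step contrastive divergence (CD-1), which approximates the gradient as the difference between data-driven and sample-driven statistics. All function names, the toy binary input vectors, and the hyperparameters are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_bernoulli(probs, rng):
    # Draw a binary vector with the given per-unit activation probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def hidden_probs(v, W, c):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # p(v_i = 1 | h) = sigmoid(b_i + sum_j W[i][j] * h_j)
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, lr, rng):
    """Apply one CD-1 update for a single binary vector; modifies W, b, c in place."""
    ph0 = hidden_probs(v0, W, c)        # positive phase (data statistics)
    h0 = sample_bernoulli(ph0, rng)
    pv1 = visible_probs(h0, W, b)       # one Gibbs step back to the visible layer
    v1 = sample_bernoulli(pv1, rng)
    ph1 = hidden_probs(v1, W, c)        # negative phase (sample statistics)
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


def train_rbm(data, n_hidden, epochs=50, lr=0.1, seed=0):
    """Train one RBM layer with CD-1 and return (W, b, c)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    for _ in range(epochs):
        for v in data:
            cd1_update(v, W, b, c, lr, rng)
    return W, b, c


if __name__ == "__main__":
    # Hypothetical 4-bit inputs standing in for binary molecular descriptors.
    toy = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    W, b, c = train_rbm(toy, n_hidden=3, epochs=20)
    print(hidden_probs(toy[0], W, c))
```

In a DBN-style pre-training scheme, the hidden activations of one trained layer would serve as the input to the next RBM, and the learned weights and hidden biases of each layer would replace random values as the starting parameters of the corresponding DNN layer before supervised training.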
ISSN: 0192-8651
1096-987X
DOI: 10.1002/jcc.24671