Deep learning for molecular design-a review of the state of the art

Bibliographic Details
Published in: Molecular systems design & engineering, Vol. 4; no. 4; pp. 828-849
Main Authors: Elton, Daniel C; Boukouvalas, Zois; Fuge, Mark D; Chung, Peter W
Format: Journal Article
Language: English
Published: Cambridge, Royal Society of Chemistry, 05.08.2019
Summary: In the space of only a few years, deep generative modeling has revolutionized how we think of artificial creativity, yielding autonomous systems which produce original images, music, and text. Inspired by these successes, researchers are now applying deep generative modeling techniques to the generation and optimization of molecules; in our review we found 45 papers on the subject published in the past two years. These works point to a future where such systems will be used to generate lead molecules, greatly reducing resources spent downstream synthesizing and characterizing bad leads in the lab. In this review we survey the increasingly complex landscape of models and representation schemes that have been proposed. The four classes of techniques we describe are recursive neural networks, autoencoders, generative adversarial networks, and reinforcement learning. After first discussing some of the mathematical fundamentals of each technique, we draw high-level connections and comparisons with other techniques and expose the pros and cons of each. Several important high-level themes emerge as a result of this work, including the shift away from the SMILES string representation of molecules towards more sophisticated representations such as graph grammars and 3D representations, the importance of reward function design, the need for better standards for benchmarking and testing, and the benefits of adversarial training and reinforcement learning over maximum-likelihood-based training. We review a recent groundswell of work which uses deep learning techniques to generate and optimize molecules.
Bibliography: Dr. Daniel C. Elton received a B.S. degree in physics from Rensselaer Polytechnic Institute in 2009 and a Ph.D. in physics from Stony Brook University in 2016. He worked as a Postdoctoral Research Associate and later an Assistant Research Scientist at the University of Maryland, College Park, from 2017 to 2019, where he focused on applications of deep learning to molecular design and discovery. In January 2019 he moved to work as a contractor Staff Scientist at the National Institutes of Health. His current work focuses on applications of deep learning and AI to detection and segmentation in medical images.
Dr. Mark D. Fuge is an Assistant Professor of Mechanical Engineering at the University of Maryland, College Park. His staff and students study fundamental scientific and mathematical questions behind how humans and computers can work together to design better complex engineered systems, from the molecular scale to systems as large as aircraft and ships, by using tools from Applied Mathematics and Computer Science. He received his Ph.D. from UC Berkeley, has received a DARPA Young Faculty Award and a National Defense Science and Engineering Graduate Fellowship, and has prior/current support from NSF, NIH, DARPA, ONR, and Lockheed Martin.
Dr. Peter W. Chung is an Associate Professor in the Department of Mechanical Engineering at the University of Maryland in College Park. He serves as the Division Lead of the Mechanics, Materials, and Manufacturing Division within the department and is also the Lead of the Energetics Group in the Center for Engineering Concepts Development.
Dr. Zois Boukouvalas received his B.S. degree in Mathematics from the University of Patras, Greece, an M.S. degree in Applied and Computational Mathematics from the Rochester Institute of Technology, and a Ph.D. degree in Applied Mathematics from the University of Maryland, Baltimore County, in 2017. Since 2017 he has worked as a Postdoctoral Research Associate at the University of Maryland, College Park, and in August 2019 he will start as an Assistant Professor in the Department of Mathematics and Statistics at American University. His research interests include blind source separation and machine learning for big data problems.
ISSN: 2058-9689
DOI: 10.1039/c9me00039a