Machine learning on drug-specific data to predict small molecule teratogenicity

Bibliographic Details
Published in	bioRxiv
Main Authors	Challa, Anup P; Beam, Andrew L; Shen, Min; Peryea, Tyler; Lavieri, Robert R; Lippmann, Ethan S; Aronoff, David M
Format	Paper
Language	English
Published	Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 30.11.2019
Subjects	Artificial intelligence; Biological activity; Clinical trials; Fetuses; Learning algorithms; Machine learning; Pregnancy; Structure-activity relationships; Teratogenicity

More Information
Summary:	Pregnant women are an especially vulnerable population, given the sensitivity of a developing fetus to chemical exposures. However, prescribing behavior for the gravid patient is guided by limited human data and conflicting cases of adverse outcomes, because pregnant populations are excluded from randomized, controlled trials. These factors increase the risk of adverse drug outcomes and reduce quality of care for pregnant populations. Herein, we propose the application of artificial intelligence to systematically predict the teratogenicity of a prescribable small molecule from information inherent to the drug. Using unsupervised and supervised machine learning, our model probes all small molecules with known structure and teratogenicity data published in research-amenable formats to identify patterns among structural, meta-structural, and in vitro bioactivity data for each drug and its teratogenicity score. With this workflow, we discovered three chemical functionalities that predispose a drug towards increased teratogenicity and two moieties with potentially protective effects. Our models predict three clinically relevant classes of teratogenicity with an AUC of 0.8 and nearly double the predictive accuracy of a blind control on the same task, suggesting successful modeling. We also present extensive barriers to translational research that restrict data-driven studies in pregnancy and therapeutically 'orphan' pregnant populations. Collectively, this work represents a first-in-kind platform for the application of computing to study and predict teratogenicity.
Footnotes:	https://github.com/apchalla/teratogenicity-qsar
DOI:	10.1101/860627
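
The summary reports two evaluation figures for a three-class task: an AUC and an accuracy compared against a "blind control". The record does not say how either was computed, so the following is only a minimal sketch under stated assumptions: the AUC is taken as a macro-averaged one-vs-rest AUC, and the blind control is taken as guessing by randomly shuffling the true labels. The class names, toy labels, and toy probabilities are hypothetical and are not drawn from the paper or its repository.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def blind_control_accuracy(y_true, n_trials=1000, seed=0):
    """Mean accuracy of an assumed blind control: the true labels,
    randomly shuffled, are used as the 'predictions'."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        shuffled = list(y_true)
        rng.shuffle(shuffled)
        total += accuracy(y_true, shuffled)
    return total / n_trials

def binary_auc(scores, is_positive):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as half a win)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, prob_rows, classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    per_class = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        per_class.append(binary_auc(scores, [t == cls for t in y_true]))
    return sum(per_class) / len(per_class)

# Hypothetical three-class toy data (not from the paper).
CLASSES = ["low", "moderate", "high"]
y_true = ["low", "low", "moderate", "moderate", "high", "high"]
prob_rows = [  # one row per drug: assumed predicted probability per class
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
# Predicted class = class with the highest probability (first on ties).
y_pred = [CLASSES[row.index(max(row))] for row in prob_rows]

print("model accuracy:", accuracy(y_true, y_pred))
print("blind control accuracy:", blind_control_accuracy(y_true))
print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, prob_rows, CLASSES))
```

With three balanced classes, a shuffled-label control of this kind is right about one time in three, which is the sort of reference point against which a claim such as "nearly double the predictive accuracy" can be read.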