Deep learning predicts extreme preterm birth from electronic health records
Published in | Journal of biomedical informatics Vol. 100; p. 103334 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | United States, Elsevier Inc, 01.12.2019 |
ISSN | 1532-0464, 1532-0480 |
DOI | 10.1016/j.jbi.2019.103334 |
Summary: |
• Extreme preterm birth (EPB) accounts for the majority of newborn deaths.
• Deep learning models that consider temporal relations can predict EPB.
• Deep learning ensemble models achieve a higher performance than individual models.
• EPB is associated with significant morbidity, e.g., systemic lupus erythematosus.
Models for predicting preterm birth generally have focused on very preterm (28–32 weeks) and moderate to late preterm (32–37 weeks) settings. However, extreme preterm birth (EPB), before the 28th week of gestational age, accounts for the majority of newborn deaths. We investigated the extent to which deep learning models that consider temporal relations documented in electronic health records (EHRs) can predict EPB.
EHR data were represented with word embeddings and passed to a temporal deep learning model, in the form of recurrent neural networks (RNNs), to predict EPB. Because of the low prevalence of EPB, the models were trained on datasets in which controls were undersampled to balance the case-control ratio. We then applied an ensemble approach to combine the trained models and predict EPB in an evaluation setting with the natural EPB ratio. We evaluated the RNN ensemble models with 10 years of EHR data from 25,689 deliveries at Vanderbilt University Medical Center. We compared their performance with traditional machine learning models (logistic regression, support vector machine, gradient boosting) trained on datasets with balanced and natural EPB ratios. Risk factors associated with EPB were identified using adjusted odds ratios.
The RNN ensemble models trained on artificially balanced data achieved a higher AUC (0.827 vs. 0.744) and sensitivity (0.965 vs. 0.682) than RNN models trained on datasets with the naturally imbalanced EPB ratio. In addition, the AUC (0.827) and sensitivity (0.965) of the RNN ensemble models were better than the AUC (0.777) and sensitivity (0.819) of the best baseline models trained on balanced data. Risk factors including twin pregnancy, short cervical length, hypertensive disorder, systemic lupus erythematosus, and hydroxychloroquine sulfate were significantly associated with EPB.
Temporal deep learning can predict EPB up to 8 weeks earlier than its occurrence. Accurate prediction of EPB may allow healthcare organizations to allocate resources effectively and ensure patients receive appropriate care. |
---|---|
Contributors: | CG performed the data collection and analysis, method design, experimental design, evaluation and interpretation of the results, and drafting and revising of the manuscript. SO, DE and GJ performed evaluation and interpretation of the results and revising of the manuscript. BM performed method design, experimental design, evaluation and interpretation of the results, and revising of the manuscript. YC performed the data collection and analysis, methods design, experimental design, evaluation and interpretation of the results, and drafting and revising of the manuscript. |
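The training scheme described in the summary (undersample controls to balance each training set, train several models, average their predictions, then evaluate at the natural case ratio) can be illustrated with a minimal sketch. This is not the authors' implementation: a small logistic regression stands in for the RNN, the data are synthetic, and every name here (`undersample`, `fit_logistic`, `train_ensemble`, `make_data`, the 5% prevalence) is a hypothetical choice made only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def undersample(cases, controls, rng):
    """Keep every case and draw an equally sized random sample of controls."""
    return cases + rng.sample(controls, len(cases))


def fit_logistic(rows, epochs=200, lr=0.1):
    """Batch gradient descent on (features, label) pairs.
    Assumption: a stand-in base learner, not the RNN used in the paper."""
    n_feat = len(rows[0][0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n_feat, 0.0
        for x, y in rows:
            err = sigmoid(b + dot(w, x)) - y
            for j in range(n_feat):
                gw[j] += err * x[j]
            gb += err
        w = [wj - lr * g / len(rows) for wj, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b


def train_ensemble(cases, controls, n_models, seed=0):
    """Train one model per independently undersampled, balanced dataset."""
    rng = random.Random(seed)
    return [fit_logistic(undersample(cases, controls, rng)) for _ in range(n_models)]


def predict_ensemble(models, x):
    """Average the predicted probabilities of all ensemble members."""
    return sum(sigmoid(b + dot(w, x)) for w, b in models) / len(models)


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(scores, labels, threshold=0.5):
    """Fraction of true cases scored at or above the threshold."""
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in case_scores) / len(case_scores)


def make_data(n, prevalence, rng):
    """Synthetic data (assumption): one informative feature, one noise feature."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        rows.append(([rng.gauss(1.5 * y, 1.0), rng.gauss(0.0, 1.0)], y))
    return rows


rng = random.Random(42)
train = make_data(2000, 0.05, rng)  # imbalanced training pool
test = make_data(2000, 0.05, rng)   # evaluation set keeps the natural ratio
cases = [r for r in train if r[1] == 1]
controls = [r for r in train if r[1] == 0]

models = train_ensemble(cases, controls, n_models=5)
scores = [predict_ensemble(models, x) for x, _ in test]
labels = [y for _, y in test]
print("AUC:", round(auc(scores, labels), 3))
print("Sensitivity:", round(sensitivity(scores, labels), 3))
```

Only the training sets are rebalanced; the test set is left at its original prevalence, mirroring the evaluation setting with the natural EPB ratio described above. The numbers printed by this sketch come from synthetic data and have no relation to the results reported in the article.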