AD-BERT: Using pre-trained language model to predict the progression from mild cognitive impairment to Alzheimer's disease

Bibliographic Details
Published in: Journal of Biomedical Informatics, Vol. 144, p. 104442
Main Authors: Mao, Chengsheng; Xu, Jie; Rasmussen, Luke; Li, Yikuan; Adekkanattu, Prakash; Pacheco, Jennifer; Bonakdarpour, Borna; Vassar, Robert; Shen, Li; Jiang, Guoqian; Wang, Fei; Pathak, Jyotishman; Luo, Yuan
Format: Journal Article
Language: English
Published: United States: Elsevier Inc., 01.08.2023
ISSN: 1532-0464, 1532-0480
DOI: 10.1016/j.jbi.2023.104442

Summary:

Highlights:
• We developed a deep learning framework called AD-BERT to predict the risk of MCI-to-AD progression using unstructured clinical notes from EHRs. We released the pre-trained model on Hugging Face: https://huggingface.co/mocherson/AD-BERT/tree/main.
• We validated AD-BERT on two real-world datasets and demonstrated its effectiveness for MCI-to-AD prediction, showing the utility of pre-trained language models and clinical notes in predicting MCI-to-AD progression, which could have important implications for improving early detection and intervention for AD.
• We applied a stratified batch sampler to address the class-imbalance problem in batch training, ensuring that all batches have an equal ratio of case and control samples.

Abstract:
We developed a deep learning framework based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model that uses unstructured clinical notes from electronic health records (EHRs) to predict the risk of disease progression from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD). We identified 3657 patients diagnosed with MCI, together with their progress notes, in the Northwestern Medicine Enterprise Data Warehouse (NMEDW) between 2000 and 2020. Progress notes dated no later than the first MCI diagnosis were used for prediction. We first preprocessed the notes by deidentifying and cleaning them and splitting them into sections, and then pre-trained a BERT model for AD (named AD-BERT), initialized from the publicly available Bio+Clinical BERT, on the preprocessed notes. All sections of a patient's notes were embedded into vector representations by AD-BERT and then combined by global MaxPooling and a fully connected network to compute the probability of MCI-to-AD progression. For validation, we conducted a similar set of experiments on 2563 MCI patients identified at Weill Cornell Medicine (WCM) during the same timeframe. Compared with the 7 baseline models, AD-BERT achieved the best performance on both datasets, with an Area Under the receiver operating characteristic Curve (AUC) of 0.849 and an F1 score of 0.440 on the NMEDW dataset, and an AUC of 0.883 and an F1 score of 0.680 on the WCM dataset. The use of EHRs for AD-related research is promising, and AD-BERT shows superior predictive performance in modeling MCI-to-AD progression. Our study demonstrates the utility of pre-trained language models and clinical notes in predicting MCI-to-AD progression, which could have important implications for improving early detection and intervention for AD.
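To make the described pipeline concrete, the sketch below loads the released AD-BERT checkpoint from Hugging Face (the model id "mocherson/AD-BERT" is taken from the repository URL above), encodes each preprocessed note section, and combines the section embeddings with global MaxPooling and a fully connected head to estimate MCI-to-AD progression risk. This is a minimal sketch, not the authors' implementation: the head sizes, the use of the [CLS] vector as the section embedding, and the class and variable names are illustrative assumptions.

# Sketch of the described architecture: encode each note section with AD-BERT,
# max-pool the section embeddings into one patient vector, and score it with a
# small fully connected head. Head sizes and the use of the [CLS] vector as the
# section embedding are assumptions, not details taken from the paper.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mocherson/AD-BERT")
encoder = AutoModel.from_pretrained("mocherson/AD-BERT")

class ProgressionHead(nn.Module):
    def __init__(self, encoder, hidden_size=768):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, 256),  # hidden width is an assumption
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (num_sections, seq_len) for one patient
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        section_vecs = outputs.last_hidden_state[:, 0, :]   # [CLS] vector per section
        patient_vec, _ = section_vecs.max(dim=0)            # global MaxPooling over sections
        return torch.sigmoid(self.classifier(patient_vec))  # probability of MCI-to-AD progression

# Example: score one patient's preprocessed note sections (placeholder texts).
sections = ["HPI: ...", "Assessment and plan: ..."]
batch = tokenizer(sections, padding=True, truncation=True, max_length=512, return_tensors="pt")
model = ProgressionHead(encoder)
with torch.no_grad():
    risk = model(batch["input_ids"], batch["attention_mask"])

The highlights also mention a stratified batch sampler that addresses class imbalance by ensuring all batches have an equal ratio of case and control samples. One reading is that every batch preserves the overall case:control ratio; a minimal sketch under that assumption, with illustrative names and binary labels, follows. Such a sampler can be passed to a torch.utils.data.DataLoader through its batch_sampler argument.

# Minimal sketch of a stratified batch sampler that keeps the case:control
# ratio roughly constant across batches; names and details are illustrative.
import random
from torch.utils.data import Sampler

class StratifiedBatchSampler(Sampler):
    def __init__(self, labels, batch_size):
        self.cases = [i for i, y in enumerate(labels) if y == 1]
        self.controls = [i for i, y in enumerate(labels) if y == 0]
        # cases per batch proportional to their prevalence in the data (at least 1)
        self.cases_per_batch = max(1, round(batch_size * len(self.cases) / len(labels)))
        self.controls_per_batch = batch_size - self.cases_per_batch

    def __len__(self):
        return min(len(self.cases) // self.cases_per_batch,
                   len(self.controls) // self.controls_per_batch)

    def __iter__(self):
        random.shuffle(self.cases)
        random.shuffle(self.controls)
        for b in range(len(self)):
            batch = (self.cases[b * self.cases_per_batch:(b + 1) * self.cases_per_batch]
                     + self.controls[b * self.controls_per_batch:(b + 1) * self.controls_per_batch])
            random.shuffle(batch)  # mix cases and controls within the batch
            yield batch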