Natural Language Processing of Radiology Reports to Detect Complications of Ischemic Stroke

Bibliographic Details
Published in	Neurocritical care Vol. 37; no. Suppl 2; pp. 291 - 302
Main Authors	Miller, Matthew I., Orfanoudaki, Agni, Cronin, Michael, Saglam, Hanife, So Yeon Kim, Ivy, Balogun, Oluwafemi, Tzalidi, Maria, Vasilopoulos, Kyriakos, Fanaropoulou, Georgia, Fanaropoulou, Nina M., Kalin, Jack, Hutch, Meghan, Prescott, Brenton R., Brush, Benjamin, Benjamin, Emelia J., Shin, Min, Mian, Asim, Greer, David M., Smirnakis, Stelios M., Ong, Charlene J.
Format	Journal Article
Language	English
Published	New York Springer US 01.08.2022 Springer Nature B.V
Subjects	Algorithms; Artificial intelligence; Big Data in Neurocritical Care; Classification; Critical care; Critical Care Medicine; Datasets; Diagnostic imaging; Edema; Hematoma; Humans; Intensive; Internal Medicine; Ischemia; Ischemic Stroke - diagnostic imaging; Machine Learning; Magnetic resonance imaging; Medical imaging; Medicine; Medicine & Public Health; Natural Language Processing; Neurology; Neurosurgery; Radiology; Stroke
Summary:	Background: Abstraction of critical data from unstructured radiologic reports using natural language processing (NLP) is a powerful tool to automate the detection of important clinical features and enhance research efforts. We present a set of NLP approaches to identify critical findings in patients with acute ischemic stroke from radiology reports of computed tomography (CT) and magnetic resonance imaging (MRI). Methods: We trained machine learning classifiers to identify categorical outcomes of edema, midline shift (MLS), hemorrhagic transformation, and parenchymal hematoma, as well as rule-based systems (RBS) to identify intraventricular hemorrhage (IVH) and continuous MLS measurements within CT/MRI reports. Using a derivation cohort of 2289 reports from 550 individuals with acute middle cerebral artery territory ischemic strokes, we externally validated our models on reports from a separate institution as well as from patients with ischemic strokes in any vascular territory. Results: In all data sets, a deep neural network with pretrained biomedical word embeddings (BioClinicalBERT) achieved the highest discrimination performance for binary prediction of edema (area under precision recall curve [AUPRC] > 0.94), MLS (AUPRC > 0.98), hemorrhagic conversion (AUPRC > 0.89), and parenchymal hematoma (AUPRC > 0.76). BioClinicalBERT outperformed lasso regression (p < 0.001) for all outcomes except parenchymal hematoma (p = 0.755). Tailored RBS for IVH and continuous MLS outperformed BioClinicalBERT (p < 0.001) and linear regression, respectively (p < 0.001). Conclusions: Our study demonstrates robust performance and external validity of a core NLP tool kit for identifying both categorical and continuous outcomes of ischemic stroke from unstructured radiographic text data. Medically tailored NLP methods have multiple important big data applications, including scalable electronic phenotyping, augmentation of clinical risk prediction models, and facilitation of automatic alert systems in the hospital setting.
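Illustrative note: the summary mentions rule-based systems for extracting continuous MLS measurements from report text. The record does not reproduce the authors' rules, so the following is only a minimal Python sketch of what such a rule might look like. The function name `extract_mls_mm`, both regular expressions, and the convention of returning 0.0 for a negated mention are assumptions made for illustration, not the published method. The sketch only handles measurements that follow the phrase "midline shift" within the same sentence.

```python
import re
from typing import Optional

# Hypothetical rule: "midline shift" followed, within the same sentence,
# by a number and a unit (mm or cm). Not the authors' actual pattern.
_MLS_PATTERN = re.compile(
    r"midline\s+shift\b[^.]{0,60}?(\d+(?:\.\d+)?)\s*(mm|cm)\b",
    re.IGNORECASE,
)

# Hypothetical negation rule: "no"/"without" shortly before "midline shift".
_NEGATION = re.compile(
    r"\b(?:no|without)\b[^.]{0,40}?midline\s+shift",
    re.IGNORECASE,
)


def extract_mls_mm(report: str) -> Optional[float]:
    """Return the largest midline shift (in mm) stated in the report.

    Returns 0.0 if a shift is explicitly negated and no measurement is
    found, and None if midline shift is not mentioned at all.
    """
    values = []
    for match in _MLS_PATTERN.finditer(report):
        value = float(match.group(1))
        if match.group(2).lower() == "cm":
            value *= 10.0  # normalize centimetres to millimetres
        values.append(value)
    if values:
        return max(values)
    if _NEGATION.search(report):
        return 0.0
    return None
```

A real system would need many more phrasings (for example, a measurement that precedes the phrase, or ranges such as "3 to 4 mm"), which is presumably where tailoring to the report style matters.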
Bibliography:	Author Contributions: MIM, AO, CJO, and SMS conceived and designed the overall study. MIM, AO, and BB wrote original computer code in Python to train and test all models. Individual radiology reports were reviewed and labeled by MIM, CJO, HS, OB, MT, KV, GF, NMF, and JK. MIM, AO, CJO, BRP, MH, and ISYK organized and administered collected data for modeling. AM is a practicing neuroradiologist who reviewed selected reports to benchmark interrater reliability. MS provided technical assistance with model training and development. MIM, MC, ISYK, and CJO wrote the manuscript. MIM, AO, MC, HS, ISYK, OB, MT, KV, GF, NMF, JK, MH, BRP, BB, EJB, MS, AM, DMG, SMS, and CJO helped with reviews and revision. DMG and EJB provided overall study direction and critical review.
ISSN:	1541-6933 1556-0961
DOI:	10.1007/s12028-022-01513-3