P269 Harnessing Natural Language Processing for Structured Information Extraction from Radiology Reports in Crohn's Disease: A Nationwide Study From the epi-IIRN
Abstract Background Large-scale analyses of imaging in Crohn's disease (CD) hold promise for advancing research on the inflammatory burden in the bowel as well as developing predictive models of disease progression. However, the lack of structured human annotations in these datasets limits the...
Saved in:
Published in | Journal of Crohn's and colitis Vol. 18; no. Supplement_1; pp. i633 - i634 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published |
24.01.2024
|
Online Access | Get full text |
Cover
Loading…
Summary: | Abstract
Background
Large-scale analyses of imaging in Crohn's disease (CD) hold promise for advancing research on the inflammatory burden in the bowel as well as developing predictive models of disease progression. However, the lack of structured human annotations in these datasets limits the ability use these for research. This study from epi-IIRN nationwide cohort, aims to develop and evaluate Natural Language Processing (NLP) algorithms for extracting structured information from unstructured radiology reports on CT and MR enterography.
Methods
We identified radiology reports of all patients diagnosed with inflammatory Bowel Disease (IBD) in the nationwide epi-IIRN cohort, which includes data from the four Israeli HMOs, covering 98% of the population. We developed an in-house NLP platform using a publicly available Hebrew pretrained BERT model. After annotating a small portion of the reports for validation, we fine-tuned the model on our dataset through a masked language task, followed by a few-shot approach based on the Next Sentence Prediction (NSP) pretraining objective for classification model fine-tuning. The platform extracts radiological indicators related to inflammation, stenosis, and location, including wall thickening, enhancement, lumen narrowing, and dilation in the following locations: jejunum, ileum, colon, sigmoid, and rectum. We validated our models using a 5-fold cross-validation experimental setup, employing accuracy, PPV, NPV, F1 score and Cohen’s kappa score as the evaluation metrics.
Results
We extracted 9,704 radiology reports (6,299 MRE, 2,405 CTE) of 7,062 IBD patients (5,972 were diagnosed with CD, and 1,076 with ulcerative colitis). The mean age on the first imaging study was 36.8±17.1 years and 52% were male. We selected 500 studies for being annotated for the radiological indicators. The most common label was wall thickening in the ileum (215 positive patients vs.285 negative) while the least common was lumen narrowing in the jejunum (1 positive patient vs. 499 negative). Table 1 summarizes the results and label distributions. The mean [95% CI] accuracy/PPV/NPV/F1/Cohen's kappa score averaged over all labels was 0.98 [0.95,1]/0.99 [0.96,1]/0.83 [0.56,1]/0.86 [0.6,1]/0.63 [0.16,1]. The labels with the highest F1/Cohen's kappa score were wall thickening, enhancement, and narrowing in the Ileum while the label with the lowest F1/Cohen's kappa score were dilation in the Colon and the Jejunum.
Conclusion
NLP methods can extract structured information from radiology reports with high accuracy. Few-shot approaches based on the Next Sentence Prediction can alleviate the need for large scale data annotation for training. NLP offers exciting possibilities for large-scale studies utilizing imaging data in CD. |
---|---|
ISSN: | 1873-9946 1876-4479 |
DOI: | 10.1093/ecco-jcc/jjad212.0399 |