P311 Detection and characterisation of extra-intestinal manifestations of IBD in clinical office notes using natural language processing
Published in | Journal of Crohn's and colitis Vol. 14; no. Supplement_1; pp. S309 - S310 |
---|---|
Format | Journal Article |
Language | English |
Published | US, Oxford University Press, 15.01.2020 |
ISSN | 1873-9946 1876-4479 |
DOI | 10.1093/ecco-jcc/jjz203.440 |
Summary: | Abstract
Background
Extra-Intestinal Manifestations (EIM) occur in nearly 40% of patients with IBD and impact both disease experience and therapeutic decision-making, but are not well captured by administrative codes. We aimed to pilot computational natural language processing (NLP) methods to characterise EIMs using consultant notes.
Methods
Subjects with a diagnosis of IBD were identified in a single-centre retrospective review of electronic health records (EHR) between 2014 and 2017. Gastroenterology (GI) notes were annotated by two reviewers for the presence and activity of EIMs. EIM concepts were identified using NLP methods leveraging UMLS libraries and hand-crafted features. Each identified EIM was characterised within a ±25-word window around the mention and classified as either an inactive concept (negated, historical, resolved) or an active concept (improved, worsened, active but unchanged). When an EIM was referenced repeatedly in a document, its status was inferred using section-based weighting, with sections ranked from greatest to least weight as follows: assessment/plan, subjective, past history, exam, and other. EIM status was classified as ambiguous when the same document contained multiple conflicting references of approximately equal weight. Model development and testing used an 80/20 dataset split.
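The windowing and section-weighting steps described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the abstract gives only the rank order of sections, so the numeric values in `SECTION_WEIGHTS`, the `tolerance` used to decide "approximately equal weight", and the function names `context_window` and `infer_status` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical weights: the abstract states only the ranking
# (assessment/plan > subjective > past history > exam > other),
# not the actual values used.
SECTION_WEIGHTS = {
    "assessment_plan": 5.0,
    "subjective": 4.0,
    "past_history": 3.0,
    "exam": 2.0,
    "other": 1.0,
}


def context_window(tokens, index, size=25):
    """Return the tokens within +/- `size` words of the mention at `index`."""
    start = max(0, index - size)
    return tokens[start:index + size + 1]


def infer_status(mentions, tolerance=0.5):
    """Combine repeated mentions of one EIM in one document.

    `mentions` is a list of (section, status) pairs. Each status
    accumulates the weight of the sections it appears in. If the two
    highest-scoring statuses differ by no more than `tolerance`
    (an assumed threshold), the result is "ambiguous".
    """
    totals = defaultdict(float)
    for section, status in mentions:
        # Unknown section names fall back to the "other" weight.
        totals[status] += SECTION_WEIGHTS.get(section, SECTION_WEIGHTS["other"])
    if not totals:
        return None
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] <= tolerance:
        return "ambiguous"
    return ranked[0][0]
```

Under these assumed weights, a "worsened" mention in the assessment/plan outweighs a "historical" mention in the past history, whereas one "improved" mention in the subjective section and two "resolved" mentions in the exam tie and are returned as ambiguous.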
Results
In 4108 unique IBD patients, 1640 (39.9%) had at least one EIM identified. The mean age was 41.9 years, 47.2% were male, and 27.0% had biologic exposure. The 1240 manually annotated documents (first GI notes) comprised 51.1% arthritis, 16.5% ocular, and 16.2% psoriasis, with erythema nodosum (EN), pyoderma gangrenosum (PG), and hidradenitis suppurativa (HS) together making up 16.2% of the cohort. In the testing set, NLP models performed well in classifying both EIM presence and status, with overall accuracy, sensitivity, and specificity of 91.2%, 92.9% and 81.8% across all EIMs in notes automatically classified as non-ambiguous (Table 1). The NLP methods classified EIM status as ambiguous in 38.9% of cases.
Table 1.
Performance of an NLP system to detect and classify extra-intestinal manifestations of IBD using consultant notes
EIM | Accuracy | Spec | Sens | Ambiguous | Total |
---|---|---|---|---|---|
Arthritis | 90.5% | 93.2% | 69.2% | 37.3% | 185 |
Ocular | 96.3% | 100.0% | 87.5% | 54.2% | 59 |
Psoriasis | 87.5% | 87.9% | 85.7% | 32.2% | 59 |
EN | 100.0% | 100.0% | 100.0% | 15.4% | 25 |
PG | 100.0% | 100.0% | 100.0% | 42.9% | 14 |
HS | 100.0% | 100.0% | 100.0% | 61.5% | 13 |
All EIMs | 91.2% | 92.9% | 81.8% | 38.9% | 355 |
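For readers unfamiliar with the reported measures, the sketch below shows how accuracy, sensitivity, specificity, and an ambiguous rate are conventionally derived from classification counts. It assumes, consistent with the Results text, that ambiguous notes are excluded before the other three measures are computed. The abstract does not report the underlying counts, so the function name `classification_metrics` and its arguments are illustrative only.

```python
def classification_metrics(tp, fp, tn, fn, ambiguous=0):
    """Standard metrics over non-ambiguous notes, plus the ambiguous rate.

    tp, fp, tn, fn: counts among notes classified as non-ambiguous.
    ambiguous: number of notes set aside as ambiguous (assumed to be
    excluded from accuracy, sensitivity, and specificity).
    """
    def ratio(numerator, denominator):
        # Return None rather than raising when a denominator is zero.
        return numerator / denominator if denominator else None

    classified = tp + fp + tn + fn
    return {
        "accuracy": ratio(tp + tn, classified),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ambiguous_rate": ratio(ambiguous, classified + ambiguous),
    }
```

Calling `classification_metrics(40, 5, 45, 10, ambiguous=25)` with these made-up counts (not taken from the study) returns one value per measure; with small test sets such as the 13 to 25 notes available for EN, PG, and HS, each measure rests on very few cases.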
Conclusion
NLP methods can detect and classify EIMs with reasonable performance and efficiency compared with traditional manual chart review. Though source document variation and ambiguity present challenges, NLP offers exciting possibilities for population-based research and decision support. |