P311 Detection and characterisation of extra-intestinal manifestations of IBD in clinical office notes using natural language processing


Bibliographic Details
Published in: Journal of Crohn's and Colitis, Vol. 14, no. Supplement_1, pp. S309-S310
Main Authors: Stidham, R; Yu, D; Lahiri, S; Vydiswaran, V
Format: Journal Article
Language: English
Published: US, Oxford University Press, 15.01.2020
ISSN: 1873-9946, 1876-4479
DOI: 10.1093/ecco-jcc/jjz203.440

Summary: Abstract

Background: Extra-intestinal manifestations (EIMs) occur in nearly 40% of patients with IBD and affect both disease experience and therapeutic decision-making, but are not well captured by administrative codes. We aimed to pilot computational natural language processing (NLP) methods to characterise EIMs using consultant notes.

Methods: Subjects with a diagnosis of IBD were identified in a single-centre retrospective review of electronic health records (EHR) from 2014 to 2017. Gastroenterology (GI) notes were annotated by two reviewers for the presence and activity of EIMs. EIM concepts were identified using NLP methods leveraging UMLS libraries and hand-crafted features. Each identified EIM was characterised within a ±25-word window around the mention and classified as either an inactive concept (negated, historical, resolved) or an active concept (improved, worsened, active but unchanged). When an EIM was referenced repeatedly in a document, its status was inferred using section-based weighting, with sections ranked from greatest to least weight as assessment/plan, subjective, past history, exam, and other. EIM status was classified as ambiguous when a document contained multiple conflicting references of approximately equal weight. Model development and testing used an 80/20 dataset split.

Results: Of 4108 unique IBD patients, 1640 (39.9%) had at least one EIM identified. The mean age was 41.9 years, 47.2% were male, and 27.0% had biologic exposure. A total of 1240 manually annotated documents (first GI notes) comprised 51.1% arthritis, 16.5% ocular, and 16.2% psoriasis, with erythema nodosum (EN), pyoderma gangrenosum (PG), and hidradenitis suppurativa (HS) together comprising 16.2% of the cohort. NLP models performed well in classifying both EIM presence and status in a testing set, with overall accuracy, sensitivity, and specificity of 91.2%, 92.9% and 81.8% across all EIMs in notes automatically classified as non-ambiguous (Table 1). NLP methods identified EIM status classification as ambiguous in 38.9% of cases.

Table 1. Performance of an NLP system to detect and classify extra-intestinal manifestations of IBD using consultant notes

EIM        Accuracy  Spec    Sens    Ambiguous  Total
Arthritis  90.5%     93.2%   69.2%   37.3%      185
Ocular     96.3%     100.0%  87.5%   54.2%      59
Psoriasis  87.5%     87.9%   85.7%   32.2%      59
EN         100.0%    100.0%  100.0%  15.4%      25
PG         100.0%    100.0%  100.0%  42.9%      14
HS         100.0%    100.0%  100.0%  61.5%      13
All EIMs   91.2%     92.9%   81.8%   38.9%      355

Conclusion: NLP methods can detect and classify EIMs with reasonable performance and efficiency compared with traditional manual chart review. Though source document variation and ambiguity present challenges, NLP offers exciting possibilities for population-based research and decision support.
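
Illustrative sketch: the abstract gives no implementation details, so the following Python is only one possible reading of the Methods (a ±25-word context window around each EIM mention, followed by section-weighted status inference with an ambiguity check). The numeric section weights, the cue-word lists, the `tolerance` threshold, and all function names are hypothetical. Only the window size, the status categories, and the ordering of the sections come from the abstract. The sketch also matches single-word terms by exact token, whereas the study used UMLS-based concept identification.

```python
import re
from collections import defaultdict

# Hypothetical weights: the abstract gives only the ordering
# (assessment/plan > subjective > past history > exam > other).
SECTION_WEIGHTS = {
    "assessment_plan": 5.0,
    "subjective": 4.0,
    "past_history": 3.0,
    "exam": 2.0,
    "other": 1.0,
}

# Hypothetical cue words for each status named in the abstract.
# Naive token matching like this is far cruder than a real negation detector.
STATUS_CUES = {
    "negated": ("no", "denies", "without"),
    "historical": ("history", "prior", "previous"),
    "resolved": ("resolved",),
    "improved": ("improved", "improving", "better"),
    "worsened": ("worsened", "worsening", "worse", "flare"),
}


def context_window(tokens, index, radius=25):
    """Return the tokens within +/- radius words of the mention at index."""
    start = max(0, index - radius)
    return tokens[start:index + radius + 1]


def classify_mention(window):
    """Assign a status from the first cue group found in the window."""
    words = set(window)
    for status, cues in STATUS_CUES.items():
        if words.intersection(cues):
            return status
    return "active_unchanged"


def resolve_status(sections, term, tolerance=0.2):
    """Combine per-mention statuses using section weights.

    sections maps a section name to its text; term is a single lowercase word.
    Returns "ambiguous" when the two best-scoring statuses are within
    tolerance (as a fraction of the top score) of each other.
    """
    scores = defaultdict(float)
    for section, text in sections.items():
        tokens = re.findall(r"[a-z']+", text.lower())
        weight = SECTION_WEIGHTS.get(section, SECTION_WEIGHTS["other"])
        for i, token in enumerate(tokens):
            if token == term:
                scores[classify_mention(context_window(tokens, i))] += weight
    if not scores:
        return "not_mentioned"
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] <= tolerance * ranked[0][1]:
        return "ambiguous"
    return ranked[0][0]


if __name__ == "__main__":
    note = {
        "assessment_plan": "Arthritis improved on current therapy.",
        "past_history": "History of arthritis.",
    }
    print(resolve_status(note, "arthritis"))
```

In this toy note, the assessment/plan mention outweighs the past-history mention, so the document-level status follows the assessment/plan. Two conflicting mentions in equally weighted sections would instead be returned as ambiguous, mirroring the rule described in the Methods.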