Active learning for extracting rare adverse events from electronic health records: A study in pediatric cardiology

Bibliographic Details
Published in International journal of medical informatics (Shannon, Ireland) Vol. 195; p. 105761
Main Authors Quennelle, Sophie; Malekzadeh-Milani, Sophie; Garcelon, Nicolas; Faour, Hassan; Burgun, Anita; Faviez, Carole; Tsopra, Rosy; Bonnet, Damien; Neuraz, Antoine
Format Journal Article
Language English
Published Ireland, Elsevier B.V., 01.03.2025
Series Studies in Health Technology and Informatics
ISSN 1386-5056
1872-8243
DOI 10.1016/j.ijmedinf.2024.105761

Summary: The objective was to automate the extraction of adverse events from the text of electronic medical records of patients hospitalized for cardiac catheterization. We focused on events related to cardiac catheterization as defined by the NCDR-IMPACT registry. These events were extracted from the Necker Children’s Hospital data warehouse. Electronic health records were pre-screened using regular expressions. The resulting datasets, which contained numerous false-positive sentences, were annotated by a cardiologist using an active learning process. A deep learning text classifier was then trained on this active learning-annotated dataset to accurately identify patients who had suffered a serious adverse event. The dataset included 2,980 patients. Regular-expression-based extraction of adverse events related to cardiac catheterization achieved perfect recall. Because adverse events are rare, the dataset obtained from this initial pre-screening step was imbalanced and contained a large number of false positives. The active learning annotation enabled the acquisition of a representative dataset suitable for training a deep learning model. The deep learning text classifier identified patients who experienced adverse events after cardiac catheterization with a recall of 0.78 and a specificity of 0.94. Our model effectively identified patients who experienced adverse events related to cardiac catheterization using real clinical data. Enabled by an active learning annotation process, it shows promise for large language model applications in clinical research, especially for rare diseases with limited annotated databases. Our model’s strength lies in its development by physicians for physicians, ensuring its relevance and applicability in clinical practice.
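The two quantitative ideas in the summary, a high-recall regular-expression pre-screen and the recall/specificity figures reported for the classifier, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration only: the patterns in `PATTERNS`, the English example sentences, and the function names `prescreen` and `recall_specificity` are hypothetical and are not taken from the study, whose actual expressions and model are not given in this record.

```python
import re

# Hypothetical patterns (assumption): the study's real expressions are not listed here.
PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\btamponade\b", r"\barrhythmi\w*", r"\bcardiac arrest\b")
]


def prescreen(sentences):
    """Keep every sentence that matches at least one pattern.

    Matching on keywords alone favours recall: negated or historical
    mentions are kept too, which is where false positives come from.
    """
    return [s for s in sentences if any(p.search(s) for p in PATTERNS)]


def recall_specificity(y_true, y_pred):
    """Compute (recall, specificity) from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return recall, specificity
```

For example, `prescreen(["No tamponade was observed.", "Discharged home the next day."])` keeps only the first sentence, even though it is a negated mention. This is the kind of false positive that a downstream annotated classifier would be expected to filter out.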