BioBridge: Unified Bio-Embedding With Bridging Modality in Code-Switched EMR

Bibliographic Details
Published in	IEEE Access, Vol. 12, pp. 141866-141877
Main Authors	Jeon, Jangyeong; Cho, Sangyeon; Lee, Dongjoon; Lee, Changhee; Kim, Junyeong
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
Subjects	Biological system modeling; code-switching; Coders; Context; Electronic health records; electronic medical record; Electronic medical records; Embedding; emergency department; Emergency medical services; Emergency services; Encoding; Feature extraction; Machine learning; Modules; Natural language processing; pediatric emergency department; Pediatrics; Source code; Training; Unified modeling language
Summary:	Pediatric Emergency Department (PED) overcrowding is a significant global challenge that calls for efficient solutions. This paper introduces the BioBridge framework, a novel approach that applies Natural Language Processing (NLP) to free-text Electronic Medical Records (EMRs) to enhance decision-making in the PED. In non-English-speaking countries such as South Korea, EMR data is often written in a code-switching (CS) format that mixes the native language with English, and most of the code-switched English words carry clinical significance. The BioBridge framework consists of two core modules: "bridging modality in context" and "unified bio-embedding." The "bridging modality in context" module improves the contextual understanding of bilingual and code-switched EMRs. In the "unified bio-embedding" module, the knowledge of a model trained in the medical domain is injected into the encoder-based model to bridge the gap between the medical and general domains. Experimental results demonstrate that the proposed BioBridge significantly outperforms traditional machine learning and pre-trained encoder-based models on several metrics, including F1 score, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and Brier score. Specifically, BioBridge-XLM achieved improvements of 0.85% in F1 score, 0.75% in AUROC, and 0.76% in AUPRC, along with a notable 3.04% decrease in the Brier score, demonstrating marked improvements in accuracy, reliability, and prediction calibration over the baseline XLM model. The source code will be made publicly available at https://github.com/jjy961228/BioBridge.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2024.3467251
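
Note on the metrics named in the summary: the sketch below shows, in plain Python, how three of them (F1 score, AUROC, and Brier score) are conventionally computed for a binary classifier. It is an illustration only and is not taken from the BioBridge source code. The function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for this example. AUPRC is omitted to keep the sketch short.

```python
# Illustrative only: standard definitions of three metrics named in the
# summary. All names and the toy data are hypothetical, not from the paper.

def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 label.
    Lower is better; it reflects calibration as well as accuracy."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall after thresholding.
    The 0.5 threshold is an assumption for this sketch."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random
    negative (ties count as half), computed over all pairs."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and predicted probabilities, for demonstration only.
    labels = [1, 0, 1, 0]
    probs = [0.9, 0.2, 0.4, 0.6]
    print("Brier:", brier_score(labels, probs))
    print("F1:", f1_score(labels, probs))
    print("AUROC:", auroc(labels, probs))
```

F1 and AUROC are higher-is-better, while the Brier score is lower-is-better, which is why the summary reports a decrease in the Brier score as an improvement.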