HarmoSATE: Harmonized embedding-based self-attentive encoder to improve accuracy of privacy-preserving federated predictive analysis

Bibliographic Details
Published in Information Sciences, Vol. 662; p. 120265
Main Authors Lee, Taek-Ho; Kim, Suhyeon; Lee, Junghye; Jun, Chi-Hyuck
Format Journal Article
Language English
Published Elsevier Inc 01.03.2024

Summary: Accurate privacy-preserving prediction using electronic health record (EHR) data distributed across multiple hospitals is essential for enabling healthcare stakeholders to obtain useful information without privacy leakage. In this paper, we propose the harmonized embedding-based self-attentive encoder (HarmoSATE), a new method for privacy-preserving federated predictive analysis. We extract contextual embeddings at each local institution using Word2Vec and then harmonize the locally trained embeddings using a neural network-based harmonization technique. The proposed method uses a deep representative encoder based on self-attention to learn the complex and dynamic patterns inherent in the harmonized embeddings of medical concepts. To evaluate our method, we conducted experiments in a distributed setting using sequential medical codes from the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. Compared with baseline models, HarmoSATE achieved a significant increase in average AUC, ranging from 3% to 8% depending on the experiment, demonstrating superior accuracy in predicting a patient's diagnosis at the next admission. HarmoSATE can be a useful alternative for obtaining accurate and practical results in various predictive tasks that use sensitive, distributed EHR data while preserving patients' privacy.
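The summary names two steps, harmonizing locally trained embeddings and encoding a sequence with self-attention. The sketch below illustrates that general idea only and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the toy codes and vectors, the use of a single linear map as a stand-in for the neural network-based harmonization (whose architecture the summary does not describe), and identity query/key/value projections in the attention step.

```python
import math


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def harmonize(local_embedding, weights):
    """Hypothetical stand-in for harmonization: map a local embedding
    into a shared space with one linear layer (an assumption)."""
    return matvec(weights, local_embedding)


def self_attention(sequence):
    """Scaled dot-product self-attention with identity Q/K/V projections
    (a simplification; a trained encoder would learn these projections)."""
    dim = len(sequence[0])
    encoded = []
    for query in sequence:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in sequence
        ]
        attn = softmax(scores)
        encoded.append(
            [sum(a * value[j] for a, value in zip(attn, sequence)) for j in range(dim)]
        )
    return encoded


# Toy data: three medical codes with made-up local embeddings.
dim = 2
local_embeddings = {
    "code_a": [1.0, 0.0],
    "code_b": [0.0, 1.0],
    "code_c": [0.5, 0.5],
}
# Hypothetical harmonization weights for one institution.
site_weights = [[0.9, 0.1], [0.2, 0.8]]

visit_sequence = ["code_a", "code_c", "code_b"]
harmonized = [harmonize(local_embeddings[c], site_weights) for c in visit_sequence]
encoded = self_attention(harmonized)
```

In this sketch each institution would hold its own `site_weights`, so only vectors in the shared space, not raw records, would feed the encoder. Whether HarmoSATE exchanges information in this way is not stated in the summary.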
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2024.120265