Attention-Based Imputation of Missing Values in Electronic Health Records Tabular Data

Bibliographic Details
Published in: 2024 IEEE 12th International Conference on Healthcare Informatics (ICHI), 2024; pp. 177-182
Main Authors: Kowsar, Ibna; Rabbani, Shourav B.; Samad, Manar D.
Format: Conference Proceeding; Journal Article
Language: English
Published: United States, IEEE, 01.06.2024
Summary: The imputation of missing values (IMV) in electronic health records tabular data is crucial to enable machine learning for patient-specific predictive modeling. While IMV methods have been developed in biostatistics and, more recently, in machine learning, deep learning-based solutions have shown limited success in learning tabular data. This paper proposes a novel attention-based missing value imputation framework that learns to reconstruct data with missing values by leveraging between-feature (self-attention) or between-sample attention. We adopt data manipulation methods used in contrastive learning to improve the generalization of the trained imputation model. The proposed self-attention imputation method outperforms state-of-the-art statistical and machine learning-based (decision-tree) imputation methods, reducing the normalized root mean squared error by 18.4% to 74.7% on five tabular data sets and by 52.6% to 82.6% on two electronic health records data sets. The proposed attention-based missing value imputation method shows superior performance across a wide range of missingness (10% to 50%) when the values are missing completely at random.
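The core idea of between-feature (self-attention) imputation can be sketched as follows: treat each scalar feature of a record as a token, let the tokens attend to one another, and use the attended reconstruction to fill in the missing positions while keeping observed values untouched. This is a minimal illustrative sketch, not the paper's architecture; the function name `self_attention_impute`, the scalar-token projections `W_q`, `W_k`, `W_v`, and the mean-pooled reconstruction head are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention_impute(x, mask, W_q, W_k, W_v):
    """Reconstruct a feature vector via between-feature (self-)attention.

    x    : (d,) feature vector; missing entries pre-filled with a
           placeholder (e.g. the column mean).
    mask : (d,) boolean, True where the value was observed.
    W_q, W_k, W_v : (1, h) projections for scalar feature "tokens"
                    (hypothetical shapes chosen for this sketch).
    """
    tokens = x[:, None]                       # (d, 1): one token per feature
    q = tokens @ W_q                          # (d, h) queries
    k = tokens @ W_k                          # (d, h) keys
    v = tokens @ W_v                          # (d, h) values
    scores = q @ k.T / np.sqrt(k.shape[1])    # (d, d) scaled dot-product logits
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax over features
    recon = attn @ v                          # (d, h) attended reconstruction
    x_hat = recon.mean(axis=1)                # collapse to scalars (simplified head)
    # Keep observed entries; impute only where the mask says "missing".
    return np.where(mask, x, x_hat)
```

In a trained model the projections would be learned by minimizing a reconstruction loss on artificially masked entries (the contrastive-style data manipulation the summary mentions); here they are random, so the sketch only shows the data flow, not the learned imputation quality.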
ISSN: 2575-2626, 2575-2634
DOI: 10.1109/ICHI61247.2024.00030