Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems

Bibliographic Details
Main Authors Dai, Zongyu; Bu, Zhiqi; Long, Qi
Format Journal Article
Language English
Published 21.12.2021
Summary: Missing data are present in most real-world problems and need careful handling to preserve prediction accuracy and statistical consistency in downstream analysis. As the gold standard for handling missing data, multiple imputation (MI) methods have been proposed to account for imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (specifically, GAN-based) multiple imputation method that works under the missing at random (MAR) mechanism with theoretical support. MI-GAN leverages recent progress in conditional generative adversarial networks and, in terms of imputation error, shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets. In particular, MI-GAN significantly outperforms other imputation methods in terms of statistical inference and computational speed.
DOI: 10.48550/arxiv.2112.11507
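
The summary says that multiple imputation accounts for imputation uncertainty when drawing inference. As a minimal sketch of what that usually means in practice, the Python below pools an estimate across several imputed datasets with Rubin's rules, the standard MI combining step. This record does not state which pooling procedure MI-GAN uses, so the choice of Rubin's rules, the function name `pool_rubin`, and the input numbers are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative helper).

    estimates: point estimate of one parameter from each imputed dataset
    variances: squared standard error of that estimate from each dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled_estimate = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical values: one regression coefficient estimated on three
# imputed copies of a dataset. These numbers are not from the paper.
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, std_error)
```

The between-imputation term is what separates multiple imputation from single imputation: when the imputed datasets disagree, the pooled standard error grows, so the uncertainty due to the missing values carries through to the final inference.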