Estimation of logistic regression with covariates missing separately or simultaneously via multiple imputation methods


Bibliographic Details
Published in Computational statistics Vol. 38; no. 2; pp. 899-934
Main Authors Lee, Shen-Ming; Le, Truong-Nhat; Tran, Phuoc-Loc; Li, Chin-Shang
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.06.2023
Springer Nature B.V
ISSN 0943-4062, 1613-9658
DOI 10.1007/s00180-022-01250-3

Summary: Logistic regression is a standard model in many studies of binary outcome data, and the analysis of missing data in this model is a fascinating topic. Building on the idea of Wang D, Chen SX (2009) Empirical likelihood for estimating equations with missing values. Ann Stat, 37:490–517, the authors propose two different types of multiple imputation (MI) estimation methods. Each method uses three empirical conditional distribution functions to generate random values for imputing the missing data, and estimates the parameters of a logistic regression with covariates missing at random (MAR) separately or simultaneously by using the estimating equations of Fay RE (1996) Alternative paradigms for the analysis of imputed survey data. J Am Stat Assoc, 91:490–498. Both methods are derived under the assumption of MAR separately or simultaneously and apply exclusively to categorical/discrete data. Simulation studies show that the two proposed methods are computationally effective. They have very similar efficiency and outperform the complete-case, semiparametric inverse probability weighting, validation likelihood, and random forest MI by chained equations (MICE) methods. Although the two proposed methods are comparable with the joint conditional likelihood (JCL) method, they involve more straightforward calculations and shorter computing times than the JCL and MICE methods. Two real data examples illustrate the applicability of the proposed methods.
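As a rough illustration of the general idea in the summary (not the authors' procedure), the sketch below imputes a missing binary covariate by drawing from the empirical distribution of its observed values within each (outcome, fully observed covariate) cell. It then fits one logistic regression to the stacked imputed data sets, which solves the score equation averaged over imputations. Everything here is an assumption made for illustration: the function names `simulate`, `impute_once`, `fit_logistic`, and `mi_logistic`, the single missing covariate `x`, the fully observed covariate `z`, and the simulated coefficients. The record above does not specify the three empirical conditional distribution functions used in the article, so a single one stands in for them.

```python
import numpy as np

def simulate(n=4000, seed=1):
    """Hypothetical data: binary x is missing at random given (y, z)."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)
    x = rng.binomial(1, 0.3 + 0.3 * z).astype(float)
    eta = -0.5 + 1.0 * x + 0.5 * z          # assumed true coefficients
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)
    p_miss = 0.2 + 0.2 * y + 0.1 * z        # missingness depends only on observed data
    x_obs = np.where(rng.random(n) < p_miss, np.nan, x)
    return x_obs, y, z

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                           # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])    # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def impute_once(x, y, z, rng):
    """Draw each missing x from the observed x values in the same (y, z) cell."""
    x_imp = x.copy()
    observed = ~np.isnan(x)
    for yv in np.unique(y):
        for zv in np.unique(z):
            cell = (y == yv) & (z == zv)
            donors = x[cell & observed]
            missing = cell & ~observed
            # Assumes every cell with missing values also has observed donors.
            if missing.any() and donors.size > 0:
                x_imp[missing] = rng.choice(donors, size=int(missing.sum()))
    return x_imp

def mi_logistic(x, y, z, n_imputations=10, seed=0):
    """Stack the imputed data sets and solve one pooled score equation."""
    rng = np.random.default_rng(seed)
    designs, responses = [], []
    for _ in range(n_imputations):
        x_imp = impute_once(x, y, z, rng)
        designs.append(np.column_stack([np.ones_like(y), x_imp, z]))
        responses.append(y)
    return fit_logistic(np.vstack(designs), np.concatenate(responses))

x_obs, y, z = simulate()
print(mi_logistic(x_obs, y, z))  # estimates for (intercept, x, z)
```

Stacking is used here only because it keeps the sketch short. Variance estimation, which the summary does not describe, is deliberately left out.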