Multiple Imputation for Model Checking: Completed-Data Plots with Missing and Latent Data

Bibliographic Details
Published in	Biometrics Vol. 61; no. 1; pp. 74-85
Main Authors	Gelman, Andrew; Van Mechelen, Iven; Verbeke, Geert; Heitjan, Daniel F.; Meulders, Michel
Format	Journal Article
Language	English
Published	350 Main Street, Malden, MA 02148, U.S.A., and P.O. Box 1354, 9600 Garsington Road, Oxford OX4 2DQ, U.K.; Blackwell Publishing; 01.03.2005; International Biometric Society; Blackwell Publishing Ltd
Subjects	Animals; Bayes Theorem; Bayesian model checking; Clinical Trials as Topic - statistics & numerical data; Data Collection; Data imputation; Data Interpretation, Statistical; Data models; Datasets; Dosage; Exploratory data analysis; Histograms; Humans; Missing data; Modeling; Models, Statistical; Multiple imputation; Nonresponse; Patient Dropouts - statistics & numerical data; Posterior predictive checks; Predictive modeling; Rats; Realized discrepancies; Residuals; School dropouts; Symptoms

Summary:	In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset (the observed data together with the imputed unobserved data) using standard procedures for complete-data inference. Here, we extend this approach to model checking by demonstrating the advantages of applying completed-data model diagnostics to imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as "predictive inference" in a non-Bayesian context). We consider graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require modeling the missingness or inclusion mechanism; this is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.
Bibliography:	ark:/67375/WNG-DWF1L7GZ-T; ArticleID:BIOM031010; istex:3EBDB3DA6C471B4AC4FD1D77701E77A86F9EE023
ISSN:	0006-341X 1541-0420
DOI:	10.1111/j.0006-341X.2005.031010.x
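
The completed-data idea in the summary can be illustrated with a small numerical sketch. This is not the authors' code. It assumes a univariate normal model with the standard noninformative prior, an ignorable missingness mechanism, and the sample maximum as the test quantity; the function name `completed_data_check` and its parameters are hypothetical. The article itself emphasizes graphical diagnostics, so the tail-area probability computed here is only a stand-in for a plot of completed data against replicated data.

```python
import numpy as np


def completed_data_check(y, n_draws=1000, stat=np.max, seed=0):
    """Posterior predictive p-value for `stat` computed on completed data.

    `y` is a 1-D array in which np.nan marks missing values.
    Assumed model (illustrative only): y_i ~ Normal(mu, sigma^2) with
    prior p(mu, sigma^2) proportional to 1 / sigma^2, and ignorable
    missingness, so the posterior depends on the observed values only.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    miss = np.isnan(y)
    y_obs = y[~miss]
    n_obs, n = y_obs.size, y.size
    ybar, s2 = y_obs.mean(), y_obs.var(ddof=1)

    exceed = 0
    for _ in range(n_draws):
        # Draw (mu, sigma^2) from the posterior given the observed data.
        sigma2 = (n_obs - 1) * s2 / rng.chisquare(n_obs - 1)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n_obs))
        sigma = np.sqrt(sigma2)

        # Completed data: observed values plus one imputation of the
        # missing values under the current parameter draw.
        y_comp = y.copy()
        y_comp[miss] = rng.normal(mu, sigma, size=int(miss.sum()))

        # Replicated completed data: a full new dataset from the model.
        y_rep = rng.normal(mu, sigma, size=n)

        exceed += int(stat(y_rep) >= stat(y_comp))

    return exceed / n_draws


# Usage: 40 evenly spaced values, one gross outlier, five missing values.
y_example = np.concatenate([np.linspace(-1, 1, 40), [50.0], [np.nan] * 5])
p_value = completed_data_check(y_example, n_draws=500, seed=1)
```

A p-value near 0 or 1 suggests that the completed data differ from what the model would replicate with respect to the chosen statistic. Because both the completed and the replicated datasets have the same size and structure, the comparison does not require a separate model for which values are missing, provided the ignorability assumption holds.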