Multiple Outputation: Inference for Complex Clustered Data by Averaging Analyses from Independent Data

Bibliographic Details
Published in	Biometrics Vol. 59; no. 2; pp. 420 - 429
Main Authors	Follmann, Dean; Proschan, Michael; Leifer, Eric
Format	Journal Article
Language	English
Published	Blackwell Publishing, 350 Main Street, Malden, MA 02148, U.S.A., and P.O. Box 1354, 9600 Garsington Road, Oxford OX4 2DQ, U.K.; International Biometric Society; 01.06.2003
Subjects	Acute Disease; Animals; Applied statistics; Bayes Theorem; Bayesian theory; Biometrics; biometry; Bootstrap; Cluster Analysis; Consultant's Forum; Correlations; Covariance; Data Interpretation, Statistical; Data models; Electrocardiography, Ambulatory; Female; Fetal Death; Generalized estimating equations; Generalized linear mixed models; genetics; Heart - anatomy & histology; Humans; Inference; Ischemia - epidemiology; Leukemia, Myeloid - immunology; Leukemia, Myeloid - therapy; linear models; Male; Multiple imputation; P values; Platelet Count; Platelets; Polymorphism, Genetic; Pregnancy; Radiation; Resampling; Sample variance; statistical analysis; Statistical variance; Time Factors; variance; Within-cluster resampling
ISSN	0006-341X 1541-0420
DOI	10.1111/1541-0420.00049

Summary:	This article applies a simple method for settings where one has clustered data, but statistical methods are only available for independent data. We assume the statistical method provides us with a normally distributed estimate, $\hat{\theta}$, and an estimate of its variance, $\hat{\sigma}^2$. We randomly select a data point from each cluster and apply our statistical method to this independent data. We repeat this multiple times, and use the average of the associated $\hat{\theta}$ as our estimate. An estimate of the variance is given by the average of the $\hat{\sigma}^2$ minus the sample variance of the $\hat{\theta}$. We call this procedure multiple outputation, as all "excess" data within each cluster is thrown out multiple times. Hoffman, Sen, and Weinberg (2001, Biometrika 88, 1121–1134) introduced this approach for generalized linear models when the cluster size is related to outcome. In this article, we demonstrate the broad applicability of the approach. Applications to angular data, p-values, vector parameters, Bayesian inference, genetics data, and random cluster sizes are discussed. In addition, asymptotic normality of estimates based on all possible outputations, as well as a finite number of outputations, is proven given weak conditions. Multiple outputation provides a simple and broadly applicable method for analyzing clustered data. It is especially suited to settings where methods for clustered data are impractical, but can also be applied generally as a quick and simple tool.
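
The procedure in the summary can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `multiple_outputation` and `mean_analysis`, the parameter names, and the choice of the sample mean as the independent-data method are all assumptions made for the example. Any analysis that returns an estimate and a variance estimate could be passed in instead.

```python
import numpy as np


def mean_analysis(sample):
    """Hypothetical independent-data method: the sample mean and the
    usual estimate of its variance (s^2 / n)."""
    return sample.mean(), sample.var(ddof=1) / sample.size


def multiple_outputation(values, cluster_ids, analyze, n_outputations=1000, seed=0):
    """Average an independent-data analysis over repeated random
    one-observation-per-cluster subsamples (assumed interface)."""
    values = np.asarray(values, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    rng = np.random.default_rng(seed)

    # Indices of the observations belonging to each cluster.
    clusters = [np.flatnonzero(cluster_ids == c) for c in np.unique(cluster_ids)]

    thetas = np.empty(n_outputations)
    variances = np.empty(n_outputations)
    for m in range(n_outputations):
        # Keep one randomly chosen observation per cluster, discard the rest.
        picks = np.array([rng.choice(idx) for idx in clusters])
        thetas[m], variances[m] = analyze(values[picks])

    # Estimate: average of the per-outputation estimates.
    estimate = thetas.mean()
    # Variance: average variance estimate minus the sample variance
    # of the per-outputation estimates, as stated in the summary.
    variance = variances.mean() - thetas.var(ddof=1)
    return estimate, variance
```

With a small number of outputations the difference in the last step can come out negative; the sketch returns it unchanged rather than guessing at a correction.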