Covariate selection for the nonparametric estimation of an average treatment effect

Bibliographic Details
Published in Biometrika Vol. 98; no. 4; pp. 861-875
Main Authors de Luna, Xavier; Waernbaum, Ingeborg; Richardson, Thomas S
Format Journal Article
Language English
Published Oxford University Press for Biometrika Trust 01.12.2011
Series Biometrika
Summary: Observational studies in which the effect of a nonrandomized treatment on an outcome of interest is estimated are common in domains such as labour economics and epidemiology. Such studies often rely on an assumption of unconfounded treatment when controlling for a given set of observed pre-treatment covariates. The choice of covariates to control in order to guarantee unconfoundedness should primarily be based on subject matter theories, although the latter typically give only partial guidance. It is tempting to include many covariates in the controlling set to try to make the assumption of an unconfounded treatment realistic. Including unnecessary covariates is suboptimal when the effect of a binary treatment is estimated nonparametrically. For instance, when using an n^{1/2}-consistent estimator, a loss of efficiency may result from using covariates that are irrelevant for the unconfoundedness assumption. Moreover, bias may dominate the variance when many covariates are used. Embracing the Neyman-Rubin model typically used in conjunction with nonparametric estimators of treatment effects, we characterize subsets from the original reservoir of covariates that are minimal in the sense that the treatment ceases to be unconfounded given any proper subset of these minimal sets. These subsets of covariates are shown to be identified under mild assumptions. These results lead us to propose data-driven algorithms for the selection of minimal sets of covariates. Copyright 2011, Oxford University Press.
ISSN: 0006-3444, 1464-3510
DOI: 10.1093/biomet/asr041
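
Illustration (not from the article): the summary defines a covariate set as minimal when the treatment is unconfounded given that set but ceases to be unconfounded given any proper subset of it. The sketch below only spells out that definition by brute force. It assumes a hypothetical oracle `is_unconfounded(subset)` that answers whether unconfoundedness holds given a subset. In practice that question would have to be answered from data, and the article's own data-driven algorithms are not reproduced here. The covariate names and the toy oracle are invented for the example.

```python
from itertools import combinations


def proper_subsets(s):
    """Yield every proper subset of s (including the empty set)."""
    items = sorted(s)
    for r in range(len(items)):  # sizes 0 .. len(s) - 1
        for combo in combinations(items, r):
            yield frozenset(combo)


def minimal_sets(covariates, is_unconfounded):
    """Return all subsets S such that is_unconfounded(S) is true
    and is_unconfounded(P) is false for every proper subset P of S.

    `is_unconfounded` is a hypothetical oracle supplied by the caller.
    The search is exponential in the number of covariates, so this is
    only suitable for small illustrative examples.
    """
    items = sorted(covariates)
    found = []
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            s = frozenset(combo)
            if is_unconfounded(s) and not any(
                is_unconfounded(p) for p in proper_subsets(s)
            ):
                found.append(s)
    return found


def toy_oracle(s):
    """Invented rule: controlling for income alone suffices, and so does
    controlling for age and education together."""
    return {"income"} <= s or {"age", "education"} <= s


result = minimal_sets({"age", "education", "income", "shoe_size"}, toy_oracle)
for s in result:
    print(sorted(s))
```

Under this toy oracle the full set of four covariates also yields unconfoundedness, but it is not minimal, which mirrors the summary's point that including unnecessary covariates is suboptimal.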