Nonparametric Prediction in Measurement Error Models

Bibliographic Details
Published in: Journal of the American Statistical Association, Vol. 104, no. 487, pp. 993-1003
Main Authors: Carroll, Raymond J.; Delaigle, Aurore; Hall, Peter
Format: Journal Article
Language: English
Published: Alexandria, VA, Taylor & Francis, 01.09.2009
American Statistical Association
Summary: Predicting the value of a variable Y corresponding to a future value of an explanatory variable X, based on a sample of previously observed independent data pairs (X_1, Y_1), ..., (X_n, Y_n) distributed like (X, Y), is very important in statistics. In the error-free case, where X is observed accurately, this problem is strongly related to that of standard regression estimation, since prediction of Y can be achieved via estimation of the regression curve E(Y|X). When the observed X_i's and the future observation of X are measured with error, prediction is of a quite different nature. Here, if T denotes the future (contaminated) available version of X, prediction of Y can be achieved via estimation of E(Y|T). In practice, estimating E(Y|T) can be quite challenging, as data may be collected under different conditions, making the measurement errors on X_i and X nonidentically distributed. We take up this problem in the nonparametric setting and introduce estimators which allow a highly adaptive approach to smoothing. Reflecting the complexity of the problem, optimal rates of convergence of estimators can vary from the semiparametric n^(-1/2) rate to much slower rates that are characteristic of nonparametric problems. Nevertheless, we are able to develop highly adaptive, data-driven methods that achieve very good performance in practice. This article has supplementary materials online. Acknowledgments: Carroll's research was supported by grants from the National Cancer Institute (CA57030, CA104620). Delaigle's work was partially supported by a fellowship from the Maurice Belz foundation. Hall's work was partially supported by the Australian Research Council and by a grant from the National Science Foundation (DMS 0604698).
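Illustration (not from the article): the distinction between E(Y|X) and E(Y|T) in the summary can be seen in a small simulation. The sketch below is not the authors' estimator. It covers only the simplest special case, assumed here for illustration, in which the past observations W_i = X_i + U_i carry the same error distribution as the future T, so an ordinary Nadaraya-Watson regression of Y on W already targets E(Y|T). The function name nadaraya_watson, the normal model, the bandwidth h and all numeric settings are hypothetical choices; the article itself addresses the harder case of nonidentically distributed errors.

```python
import numpy as np


def nadaraya_watson(t, w, y, h):
    """Gaussian-kernel Nadaraya-Watson estimate of E(Y | W = t)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Kernel weights: one row per evaluation point, one column per sample.
    k = np.exp(-0.5 * ((t[:, None] - w[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)


# Assumed toy model: X ~ N(0, 1), U ~ N(0, 0.25), Y = 2 X + noise.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(0.0, 1.0, n)            # true covariate, unobserved in practice
u = rng.normal(0.0, 0.5, n)            # measurement error on past data
w = x + u                              # contaminated covariate actually observed
y = 2.0 * x + rng.normal(0.0, 0.3, n)

t_new = np.array([-1.0, 0.0, 1.0])     # future contaminated values T
pred = nadaraya_watson(t_new, w, y, h=0.2)

# For this normal model, E(X | T = t) = t * var(X) / (var(X) + var(U)),
# so E(Y | T = t) = 2 * t * 1.0 / 1.25, which is flatter than E(Y | X = t) = 2 t.
target_given_t = 2.0 * t_new * 1.0 / (1.0 + 0.25)
target_given_x = 2.0 * t_new

print("prediction      :", pred)
print("E(Y | T = t)    :", target_given_t)
print("E(Y | X = t)    :", target_given_x)
```

In this setting the kernel predictions track the attenuated curve E(Y|T) rather than the error-free regression curve E(Y|X), which is why plugging T into an estimate of E(Y|X) would give a poor prediction.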
ISSN: 0162-1459, 1537-274X
DOI: 10.1198/jasa.2009.tm07543