Linear Model Selection by Cross-validation

Bibliographic Details
Published in: Journal of the American Statistical Association, Vol. 88, No. 422, pp. 486-494
Main Author: Shao, Jun
Format: Journal Article
Language: English
Published: Alexandria, VA: Taylor & Francis Group / American Statistical Association / Taylor & Francis Ltd, 01.06.1993
Summary: We consider the problem of selecting a model having the best predictive ability among a class of linear models. The popular leave-one-out cross-validation method, which is asymptotically equivalent to many other model selection methods such as the Akaike information criterion (AIC), the C_p, and the bootstrap, is asymptotically inconsistent in the sense that the probability of selecting the model with the best predictive ability does not converge to 1 as the total number of observations n → ∞. We show that the inconsistency of the leave-one-out cross-validation can be rectified by using a leave-n_v-out cross-validation with n_v, the number of observations reserved for validation, satisfying n_v/n → 1 as n → ∞. This is a somewhat shocking discovery, because n_v/n → 1 is totally opposite to the popular leave-one-out recipe in cross-validation. Motivations, justifications, and discussions of some practical aspects of the use of the leave-n_v-out cross-validation method are provided, and results from a simulation study are presented.
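The leave-n_v-out procedure summarized above can be sketched in a few lines of Python. The snippet below is an illustration, not code from the paper: it uses a Monte Carlo approximation (averaging the validation error over random splits) because evaluating all C(n, n_v) validation sets is infeasible for large n_v. The function name, parameters, and the example validation size are assumptions for illustration; the key point is that the validation fraction n_v/n is chosen large, in contrast to leave-one-out (n_v = 1).

```python
import numpy as np

def cv_nv_score(X, y, features, n_v, n_splits=200, seed=0):
    """Monte Carlo approximation of the leave-n_v-out cross-validation
    score for the linear model restricted to the given feature columns."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_splits):
        val = rng.choice(n, size=n_v, replace=False)    # validation indices
        train = np.setdiff1d(np.arange(n), val)         # construction indices
        # Least-squares fit on the construction set only
        beta, *_ = np.linalg.lstsq(X[np.ix_(train, features)],
                                   y[train], rcond=None)
        resid = y[val] - X[np.ix_(val, features)] @ beta
        scores.append(np.mean(resid ** 2))
    return float(np.mean(scores))

# Model selection: among candidate feature subsets, pick the one with the
# smallest cv_nv_score, with n_v chosen so that n_v/n is close to 1.
```

Selecting the subset that minimizes this score, with n_v growing so that n_v/n → 1 (one rate satisfying the condition is a construction-set size of order n^{3/4}), is the consistent variant the summary contrasts with leave-one-out cross-validation.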
ISSN: 0162-1459, 1537-274X
DOI: 10.1080/01621459.1993.10476299