Linear Model Selection by Cross-validation

Bibliographic Details
Published in: Journal of the American Statistical Association, Vol. 88, No. 422, pp. 486-494
Main Author: Shao, Jun
Format: Journal Article
Language: English
Published: Alexandria, VA: Taylor & Francis Group / American Statistical Association / Taylor & Francis Ltd, 01.06.1993
Summary: We consider the problem of selecting a model having the best predictive ability among a class of linear models. The popular leave-one-out cross-validation method, which is asymptotically equivalent to many other model selection methods such as the Akaike information criterion (AIC), the C_p, and the bootstrap, is asymptotically inconsistent in the sense that the probability of selecting the model with the best predictive ability does not converge to 1 as the total number of observations n → ∞. We show that the inconsistency of the leave-one-out cross-validation can be rectified by using a leave-n_v-out cross-validation with n_v, the number of observations reserved for validation, satisfying n_v/n → 1 as n → ∞. This is a somewhat shocking discovery, because n_v/n → 1 is totally opposite to the popular leave-one-out recipe in cross-validation. Motivations, justifications, and discussions of some practical aspects of the use of the leave-n_v-out cross-validation method are provided, and results from a simulation study are presented.
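The leave-n_v-out procedure summarized above can be sketched in a few lines of Python. The snippet below is an illustration, not code from the paper: it uses a Monte Carlo approximation (averaging the validation error over random splits) because evaluating all C(n, n_v) validation sets is infeasible for large n_v. The function name, parameters, and the example validation size are assumptions for illustration; the key point is that the validation fraction n_v/n is chosen large, in contrast to leave-one-out (n_v = 1).

```python
import numpy as np

def cv_nv_score(X, y, features, n_v, n_splits=200, seed=0):
    """Monte Carlo approximation of the leave-n_v-out cross-validation
    score for the linear model restricted to the given feature columns."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_splits):
        val = rng.choice(n, size=n_v, replace=False)    # validation indices
        train = np.setdiff1d(np.arange(n), val)         # construction indices
        # Least-squares fit on the construction set only
        beta, *_ = np.linalg.lstsq(X[np.ix_(train, features)],
                                   y[train], rcond=None)
        resid = y[val] - X[np.ix_(val, features)] @ beta
        scores.append(np.mean(resid ** 2))
    return float(np.mean(scores))

# Model selection: among candidate feature subsets, pick the one with the
# smallest cv_nv_score, with n_v chosen so that n_v/n is close to 1.
```

Selecting the subset that minimizes this score, with n_v growing so that n_v/n → 1 (one rate satisfying the condition is a construction-set size of order n^{3/4}), is the consistent variant the summary contrasts with leave-one-out cross-validation.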
ISSN: 0162-1459, 1537-274X
DOI: 10.1080/01621459.1993.10476299