A note on the validity of cross-validation for evaluating autoregressive time series prediction

Bibliographic Details
Published in	Computational statistics & data analysis Vol. 120; pp. 70 - 83
Main Authors	Bergmeir, Christoph, Hyndman, Rob J., Koo, Bonsoo
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.04.2018
Subjects	Autoregression; Cross-validation; Time series
Summary:	One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). In time series forecasting, however, the inherent serial correlation and potential non-stationarity of the data mean that its application is not straightforward, and practitioners often replace it with an out-of-sample (OOS) evaluation. It is shown that for purely autoregressive models, standard K-fold CV can be used provided the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, where CV can also control for overfitting the data. Theoretical insights supporting these arguments are presented, along with a simulation study and a real-world example. It is shown empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
ISSN:	0167-9473 1872-7352
DOI:	10.1016/j.csda.2017.11.003
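
The comparison described in the summary can be illustrated with a minimal sketch. It is not the authors' code or experimental setup; it only shows the general idea under stated assumptions: the series is a simulated AR(2) process, the fitted model is a linear AR(3) estimated by least squares (so it nests the true model), and embedded rows are assigned to folds at random. All function names (`embed`, `fit_predict`, `kfold_cv_mse`, `oos_mse`, `simulate_ar2`) and parameter values are hypothetical choices made for this example.

```python
import numpy as np


def embed(series, p):
    """Turn a series into a regression problem: each row of X holds the
    p values preceding the corresponding target in y (lag 1 first)."""
    n = len(series)
    X = np.column_stack([series[p - k - 1 : n - k - 1] for k in range(p)])
    y = series[p:]
    return X, y


def fit_predict(X_train, y_train, X_test):
    """Fit a linear AR model with intercept by least squares and predict."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def kfold_cv_mse(X, y, k=5, seed=0):
    """Standard K-fold CV on the embedded rows (random fold assignment
    is an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for fold in folds:
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        pred = fit_predict(X[train], y[train], X[fold])
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errors))


def oos_mse(X, y, test_frac=0.2):
    """Out-of-sample evaluation: train on the first part of the series,
    test on the final test_frac of it."""
    split = int(len(y) * (1 - test_frac))
    pred = fit_predict(X[:split], y[:split], X[split:])
    return float(np.mean((y[split:] - pred) ** 2))


def simulate_ar2(n, phi=(0.5, -0.3), burn=100, seed=0):
    """Simulate a stationary AR(2) process with standard normal noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(2, n + burn):
        x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + noise[t]
    return x[burn:]


series = simulate_ar2(500)
X, y = embed(series, p=3)  # AR(3) nests the true AR(2) model
cv_err = kfold_cv_mse(X, y)
oos_err = oos_mse(X, y)
print(f"5-fold CV MSE: {cv_err:.3f}")
print(f"OOS MSE:       {oos_err:.3f}")
```

Because the noise variance in this simulation is 1, both estimates should land near 1. The CV estimate averages over every embedded row, whereas the OOS estimate relies on the last 20% of the series only.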