A note on the validity of cross-validation for evaluating autoregressive time series prediction

Bibliographic Details
Published in: Computational Statistics & Data Analysis, Vol. 120, pp. 70-83
Main Authors: Bergmeir, Christoph; Hyndman, Rob J.; Koo, Bonsoo
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.04.2018
Summary: One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, in time series forecasting, the inherent serial correlation and potential non-stationarity of the data make its application less straightforward, and practitioners often replace it with an out-of-sample (OOS) evaluation. It is shown that for purely autoregressive models, standard K-fold CV can be used provided the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, where CV can also control for overfitting the data. Theoretical insights supporting these arguments are presented, along with a simulation study and a real-world example. It is shown empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2017.11.003
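
The comparison described in the summary can be illustrated with a small sketch. This is not the authors' code or experimental setup. It assumes a linear autoregression fitted by ordinary least squares, a simulated AR(2) series, and randomly assigned folds. All function names (`simulate_ar`, `embed`, `kfold_cv_mse`, `oos_mse`) are hypothetical and introduced only for illustration. The series is embedded into a matrix of lagged values so that each row is an ordinary regression example, and the fitted order (3) is deliberately larger than the true order (2) so that the fitted model nests the data-generating one.

```python
import numpy as np

def simulate_ar(n, coefs, rng, burn_in=200):
    """Simulate an AR(p) series with i.i.d. standard normal errors (illustrative)."""
    p = len(coefs)
    y = np.zeros(n + burn_in)
    noise = rng.standard_normal(n + burn_in)
    for t in range(p, n + burn_in):
        # y[t - p:t][::-1] is (y[t-1], ..., y[t-p])
        y[t] = np.dot(coefs, y[t - p:t][::-1]) + noise[t]
    return y[burn_in:]

def embed(y, p):
    """Build the lag matrix: row i holds (y[t-1], ..., y[t-p]) for target y[t], t = p + i."""
    X = np.column_stack([y[p - k:len(y) - k] for k in range(1, p + 1)])
    return X, y[p:]

def fit_predict(X_train, y_train, X_test):
    """Fit a linear AR model with intercept by least squares and predict."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def kfold_cv_mse(X, y, k, rng):
    """Standard K-fold CV on the embedded rows, with randomly assigned folds."""
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for fold in folds:
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        pred = fit_predict(X[train], y[train], X[fold])
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errors))

def oos_mse(X, y, test_fraction=0.2):
    """Out-of-sample evaluation: train on the first block, test on the last block."""
    n_test = int(len(y) * test_fraction)
    pred = fit_predict(X[:-n_test], y[:-n_test], X[-n_test:])
    return float(np.mean((y[-n_test:] - pred) ** 2))

rng = np.random.default_rng(0)
y = simulate_ar(500, np.array([0.5, -0.3]), rng)  # assumed AR(2) example
X, target = embed(y, p=3)                          # order 3 nests the true AR(2)
cv_error = kfold_cv_mse(X, target, k=5, rng=rng)
oos_error = oos_mse(X, target)
print(f"5-fold CV MSE: {cv_error:.3f}")
print(f"OOS MSE:       {oos_error:.3f}")
```

Both estimates target the one-step-ahead error, which here has a true innovation variance of 1. K-fold CV uses every embedded row for testing once, whereas the OOS estimate relies only on the final block.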