Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach
Published in | Expert Systems with Applications, Vol. 140, p. 112896 |
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York: Elsevier Ltd, 01.02.2020 |
Summary: | • Building a model on a set of related time series can improve forecast accuracy. • The performance of global models can degenerate if they are built on disparate time series. • A subgrouping strategy then improves the accuracy of the baseline global models.
With the advent of Big Data, databases containing large quantities of similar time series are now available in many application domains. Forecasting time series in these domains with traditional univariate forecasting procedures leaves great potential for producing accurate forecasts untapped. Recurrent neural networks (RNNs), and in particular Long Short-Term Memory (LSTM) networks, have recently been shown to outperform state-of-the-art univariate time series forecasting methods in this context when trained across all available time series. However, if the time series database is heterogeneous, accuracy may degenerate, so on the way towards fully automatic forecasting methods in this space, a notion of similarity between the time series needs to be built into the methods. To this end, we present a prediction model that can be used with different types of RNN models on subgroups of similar time series, which are identified by time series clustering techniques. We assess our proposed methodology using LSTM networks, a widely popular RNN variant, together with various clustering algorithms, such as kMeans, DBScan, Partitioning Around Medoids (PAM), and Snob. Our method achieves competitive results on benchmarking datasets under competition evaluation procedures. In particular, in terms of mean sMAPE accuracy it consistently outperforms the baseline LSTM model, and it outperforms all other methods on the CIF2016 forecasting competition dataset. |
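The subgrouping strategy the abstract describes can be illustrated with a minimal sketch (not the authors' code): cluster the series on simple summary features, then pool each cluster's training windows into one "global" model per cluster. Here a k-means on toy mean/std features stands in for the paper's clustering options (kMeans, DBScan, PAM, Snob), and a pooled linear autoregression stands in for the LSTM; the feature choice, lag length, and function names are illustrative assumptions.

```python
# Hypothetical sketch of per-cluster global forecasting models.
# KMeans stands in for the paper's clustering step; a pooled linear
# autoregression stands in for the per-cluster LSTM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

def make_windows(series, lag):
    """Slice one series into (lag-length input, next value) training pairs."""
    X, y = [], []
    for t in range(len(series) - lag):
        X.append(series[t:t + lag])
        y.append(series[t + lag])
    return np.array(X), np.array(y)

def fit_per_cluster(db, lag=4, n_clusters=2, seed=0):
    """Cluster the series, then pool each cluster's windows into one model."""
    feats = np.array([[s.mean(), s.std()] for s in db])  # toy features
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(feats)
    models = {}
    for c in range(n_clusters):
        Xs, ys = [], []
        for s, lab in zip(db, labels):
            if lab == c:
                X, y = make_windows(s, lag)
                Xs.append(X)
                ys.append(y)
        models[c] = LinearRegression().fit(np.vstack(Xs), np.concatenate(ys))
    return labels, models

def forecast_next(db, labels, models, lag=4):
    """One-step-ahead forecast for each series using its cluster's model."""
    return [models[lab].predict(s[-lag:].reshape(1, -1))[0]
            for s, lab in zip(db, labels)]
```

A database mixing, say, trending and flat series would be split into two subgroups, and each forecast is produced by the model trained only on the series most similar to the one being forecast, mirroring the paper's baseline-vs-subgrouped comparison.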
ISSN: | 0957-4174, 1873-6793 |
DOI: | 10.1016/j.eswa.2019.112896 |