Evaluation Procedures for Forecasting with Spatio-Temporal Data

Bibliographic Details
Published in	Machine Learning and Knowledge Discovery in Databases Vol. 11051; pp. 703 - 718
Main Authors	Oliveira, Mariana; Torgo, Luís; Santos Costa, Vítor
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2019
Series	Lecture Notes in Computer Science
Subjects	Cross-validation; Evaluation methods; Geo-referenced time series; Performance estimation; Reproducible research; Spatio-temporal data
Summary:	The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications that use geo-referenced time series data, motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieving progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation, as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show that both CV and OOS report useful estimates. Further, they suggest that blocking may be useful in addressing CV’s tendency to underestimate error. OOS can be very sensitive to test size, as expected, but estimates can be improved by careful management of the temporal dimension in training. Code related to this paper is available at: https://github.com/mrfoliveira/Evaluation-procedures-for-forecasting-with-spatio-temporal-data.
Bibliography:	Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-3-030-10925-7_43) contains supplementary material, which is available to authorized users.
ISBN:	9783030109240; 3030109240
ISSN:	0302-9743; 1611-3349
DOI:	10.1007/978-3-030-10925-7_43
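
Illustration:	The summary contrasts cross-validation with blocking against out-of-sample estimation. The Python sketch below shows, in a generic way, how the two kinds of split might be built over time-step indices. It is not the authors' implementation (their code is in the repository linked in the summary). The function names temporal_block_cv and oos_holdout, and the parameters n_folds and test_fraction, are hypothetical. It also assumes that all locations observed at a given time step fall on the same side of a split.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (train time steps, test time steps)


def temporal_block_cv(n_times: int, n_folds: int) -> Iterator[Split]:
    """Yield CV splits in which each test fold is one contiguous time block.

    Hypothetical helper: contiguous blocks stand in for "blocking" in time,
    as opposed to assigning individual time steps to folds at random.
    """
    # Integer block boundaries that cover range(n_times) without gaps.
    bounds = [k * n_times // n_folds for k in range(n_folds + 1)]
    for k in range(n_folds):
        lo, hi = bounds[k], bounds[k + 1]
        test = list(range(lo, hi))
        # Every time step outside the test block is used for training,
        # so training data may lie both before and after the test block.
        train = [t for t in range(n_times) if t < lo or t >= hi]
        yield train, test


def oos_holdout(n_times: int, test_fraction: float) -> Split:
    """Return one out-of-sample split: train on the past, test on the end.

    Hypothetical helper: the last test_fraction of the time steps
    (at least one step) is held out for testing.
    """
    n_test = max(1, int(n_times * test_fraction))
    split = n_times - n_test
    return list(range(split)), list(range(split, n_times))


if __name__ == "__main__":
    # Ten time steps, five temporal blocks.
    for train, test in temporal_block_cv(n_times=10, n_folds=5):
        print("CV  train:", train, "test:", test)

    # Hold out the final 20% of the time steps.
    train, test = oos_holdout(n_times=10, test_fraction=0.2)
    print("OOS train:", train, "test:", test)
```

In this sketch the out-of-sample split uses a single test window, so its estimate depends directly on test_fraction, which is consistent with the sensitivity to test size noted in the summary. The blocked CV split tests on every time step once, at the cost of training on data that follows the test block.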