Speeding up dynamic time warping distance for sparse time series data

Dynamic time warping (DTW) distance has been effectively used in mining time series data in a multitude of domains. However, in its original formulation DTW is extremely inefficient in comparing long sparse time series, containing mostly zeros and some unevenly spaced nonzero observations. Original...

Full description

Saved in:
Bibliographic Details
Published inKnowledge and information systems Vol. 54; no. 1; pp. 237 - 263
Main Authors Mueen, Abdullah, Chavoshi, Nikan, Abu-El-Rub, Noor, Hamooni, Hossein, Minnich, Amanda, MacCarthy, Jonathan
Format Journal Article
LanguageEnglish
Published London Springer London 2018
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Dynamic time warping (DTW) distance has been effectively used in mining time series data in a multitude of domains. However, in its original formulation DTW is extremely inefficient in comparing long sparse time series, containing mostly zeros and some unevenly spaced nonzero observations. Original DTW distance does not take advantage of this sparsity, leading to redundant calculations and a prohibitively large computational cost for long time series. We derive a new time warping similarity measure (AWarp) for sparse time series that works on the run-length encoded representation of sparse time series. The complexity of AWarp is quadratic on the number of observations as opposed to the range of time of the time series. Therefore, AWarp can be several orders of magnitude faster than DTW on sparse time series. AWarp is exact for binary-valued time series and a close approximation of the original DTW distance for any-valued series. We discuss useful variants of AWarp: bounded (both upper and lower), constrained, and multidimensional. We show applications of AWarp to three data mining tasks including clustering, classification, and outlier detection, which are otherwise not feasible using classic DTW, while producing equivalent results. Potential areas of application include bot detection, human activity classification, search trend analysis, seismic analysis, and unusual review pattern mining.
ISSN:0219-1377
0219-3116
DOI:10.1007/s10115-017-1119-0