Embedding-based subsequence matching in time-series databases

Bibliographic Details
Published in ACM Transactions on Database Systems, Vol. 36, no. 3, pp. 1-39
Main Authors Papapetrou, Panagiotis; Athitsos, Vassilis; Potamias, Michalis; Kollios, George; Gunopulos, Dimitrios
Format Journal Article
Language English
Published New York, NY Association for Computing Machinery 01.08.2011
ISSN 0362-5915, 1557-4644
DOI 10.1145/2000824.2000827

Summary: We propose an embedding-based framework for subsequence matching in time-series databases that improves the efficiency of processing subsequence matching queries under the Dynamic Time Warping (DTW) distance measure. This framework partially reduces subsequence matching to vector matching, using an embedding that maps each query sequence to a vector and each database time series into a sequence of vectors. The database embedding is computed offline, as a preprocessing step. At runtime, given a query object, an embedding of that object is computed online. Relatively few areas of interest are efficiently identified in the database sequences by comparing the embedding of the query with the database vectors. Those areas of interest are then fully explored using the exact DTW-based subsequence matching algorithm. We apply the proposed framework to define two specific methods. The first method focuses on time-series subsequence matching under unconstrained Dynamic Time Warping. The second method targets subsequence matching under constrained Dynamic Time Warping (cDTW), where warping paths are not allowed to stray too far from the diagonal. In our experiments, good trade-offs between retrieval accuracy and retrieval efficiency are obtained for both methods, and the results are competitive with respect to current state-of-the-art methods.
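The filter-and-refine pipeline described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, for illustration only, that each embedding coordinate is the subsequence-DTW cost of a hypothetical reference sequence, that candidates are ranked by Euclidean distance between vectors, and that each candidate is refined within a fixed-length window. The names `end_costs`, `embed_database`, `embed_query`, `filter_and_refine`, `top_k`, and `window` are all invented here.

```python
import math

def end_costs(pattern, series):
    """Exact subsequence DTW: for every end position j in `series`, the cost of
    the best warping path matching all of `pattern` to a subsequence ending at j."""
    m = len(series)
    prev = [0.0] * (m + 1)              # row 0 is all zeros: a match may start anywhere
    for i in range(1, len(pattern) + 1):
        cur = [math.inf] * (m + 1)      # column 0 stays infinite: pattern must consume data
        for j in range(1, m + 1):
            d = abs(pattern[i - 1] - series[j - 1])
            cur[j] = d + min(prev[j - 1], prev[j], cur[j - 1])
        prev = cur
    return prev[1:]

def embed_database(references, series):
    """Offline step (assumed form): one vector per database position."""
    per_reference = [end_costs(r, series) for r in references]
    return [tuple(column) for column in zip(*per_reference)]

def embed_query(references, query):
    """Online step (assumed form): one vector for the whole query."""
    return tuple(end_costs(r, query)[-1] for r in references)

def filter_and_refine(query, series, db_vectors, references, top_k, window):
    """Rank end positions by vector distance, then run exact DTW near the best ones."""
    q_vec = embed_query(references, query)
    ranked = sorted(range(len(series)),
                    key=lambda j: math.dist(q_vec, db_vectors[j]))
    best = (math.inf, -1)               # (DTW cost, end position in series)
    for j in ranked[:top_k]:
        start = max(0, j - window + 1)
        local = end_costs(query, series[start:j + 1])
        cost = min(local)
        best = min(best, (cost, start + local.index(cost)))
    return best

# Toy data, invented for this sketch; the query occurs exactly at positions 3..7.
references = [[0, 0], [1, 3, 5], [2, 2]]
series = [0, 0, 0, 1, 3, 5, 3, 1, 0, 0, 0, 2, 2, 2, 0, 0]
query = [1, 3, 5, 3, 1]

db_vectors = embed_database(references, series)          # computed once, offline
exact = filter_and_refine(query, series, db_vectors, references,
                          top_k=len(series), window=2 * len(query))
approx = filter_and_refine(query, series, db_vectors, references,
                           top_k=3, window=2 * len(query))
print(exact, approx)
```

With `top_k` equal to the series length every position is refined, so this sketch behaves as an exhaustive windowed search. A smaller `top_k` does less exact DTW work but may miss the best match, which is the accuracy versus efficiency trade-off the summary refers to.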