Enhancing Time Series Clustering by Incorporating Multiple Distance Measures with Semi-Supervised Learning

Bibliographic Details
Published in Journal of computer science and technology Vol. 30; no. 4; pp. 859 - 873
Main Authors Jing Zhou (周竞), Shan-Feng Zhu (朱山风), Xiaodi Huang (黄晓地), Yanchun Zhang (张彦春)
Format Journal Article
Language English
Published New York Springer US 01.07.2015
Springer Nature B.V
Summary: Time series clustering is widely applied in various areas. Existing research focuses mainly on distance measures between two time series, such as dynamic time warping (DTW) based methods, edit-distance based methods, and shapelet-based methods. In this work, we experimentally demonstrate, for the first time, that no single distance measure performs significantly better than the others when time series datasets are clustered with spectral clustering. This raises the question of how to choose an appropriate measure for a given time series dataset. To answer this question, we propose an integration scheme that incorporates multiple distance measures using semi-supervised clustering. Our approach integrates all the measures by extracting valuable underlying information for the clustering. To the best of our knowledge, this work demonstrates for the first time that a constraint-based semi-supervised clustering method can enhance time series clustering by combining multiple distance measures. In tests on various time series datasets, our method outperforms both individual measures and typical integration approaches.
Bibliography:11-2296/TP
time series analysis, clustering, dynamic programming, information search and retrieval
Jing Zhou, Shan-Feng Zhu, Xiaodi Huang, Yanchun Zhang (1 School of Computer Science, Fudan University, Shanghai 200433, China; 2 Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China; 3 School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia; 4 School of Engineering and Science, Victoria University, Melbourne, Victoria 8001, Australia; 5 Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China)
ISSN:1000-9000
1860-4749
DOI:10.1007/s11390-015-1565-7