Enhancing Time Series Clustering by Incorporating Multiple Distance Measures with Semi-Supervised Learning
Published in | Journal of computer science and technology Vol. 30; no. 4; pp. 859 - 873 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.07.2015, Springer Nature B.V |
Subjects | |
Summary: | Time series clustering is widely applied in various areas. Existing research focuses mainly on distance measures between two time series, such as dynamic time warping (DTW) based methods, edit-distance based methods, and shapelet-based methods. In this work, we experimentally demonstrate, for the first time, that no single distance measure performs significantly better than the others when time series datasets are clustered with spectral clustering. This raises the question of how to choose an appropriate measure for a given time series dataset. To answer it, we propose an integration scheme that incorporates multiple distance measures using semi-supervised clustering. Our approach integrates all the measures by extracting the underlying information that is valuable for clustering. To the best of our knowledge, this work demonstrates for the first time that a constraint-based semi-supervised clustering method can enhance time series clustering by combining multiple distance measures. In tests on various time series datasets, our method outperforms the individual measures as well as typical integration approaches. |
---|---|
Bibliography: | 11-2296/TP. time series analysis, clustering, dynamic programming, information search and retrieval. Jing Zhou, Shan-Feng Zhu, Xiaodi Huang, Yanchun Zhang (1 School of Computer Science, Fudan University, Shanghai 200433, China; 2 Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China; 3 School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia; 4 School of Engineering and Science, Victoria University, Melbourne, Victoria 8001, Australia; 5 Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China) |
ISSN: | 1000-9000 1860-4749 |
DOI: | 10.1007/s11390-015-1565-7 |
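
The summary describes clustering time series with spectral clustering over several distance measures. The following is a minimal sketch of that setting only, not the authors' constraint-based semi-supervised scheme, which this record does not specify. It assumes two measures (Euclidean and a basic DTW), a Gaussian kernel with a median bandwidth, and plain averaging of the affinity matrices as a naive way to combine them; the toy data and all function names are hypothetical.

```python
import numpy as np

def euclidean(a, b):
    """Pointwise Euclidean distance between two equal-length series."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def dtw(a, b):
    """Basic dynamic time warping distance via dynamic programming."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def distance_matrix(series, measure):
    """Symmetric pairwise distance matrix under the given measure."""
    k = len(series)
    D = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            D[i, j] = D[j, i] = measure(series[i], series[j])
    return D

def to_affinity(D):
    """Gaussian kernel; the median-distance bandwidth is an assumption."""
    sigma = np.median(D[D > 0])
    return np.exp(-(D ** 2) / (2 * sigma ** 2))

def spectral_bipartition(A):
    """Two-way spectral clustering from the sign of the Fiedler vector."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Toy data: three phase-shifted sine waves and three offset ramps.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 30)
sines = [np.sin(t + s) + 0.05 * rng.standard_normal(30) for s in (0.0, 0.2, 0.4)]
ramps = [t / np.pi - 1 + o + 0.05 * rng.standard_normal(30) for o in (0.0, 0.1, 0.2)]
series = sines + ramps

# Naive integration: average the per-measure affinity matrices.
affinities = [to_affinity(distance_matrix(series, m)) for m in (euclidean, dtw)]
combined = np.mean(affinities, axis=0)
labels = spectral_bipartition(combined)
print(labels)
```

Averaging treats every measure as equally informative. The summary's point is that no single measure dominates across datasets, which is the motivation for a scheme that learns how to combine them rather than fixing the weights in advance.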