Model-based clustering and segmentation of time series with changes in regime

Bibliographic Details
Published in: Advances in Data Analysis and Classification, Vol. 5, no. 4, pp. 301-321
Main Authors: Samé, Allou; Chamroukhi, Faicel; Govaert, Gérard; Aknin, Patrice
Format: Journal Article
Language: English
Published: Berlin/Heidelberg, Springer-Verlag, 01.12.2011
Summary: Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the Expectation–Maximization (EM) algorithm. Within the context of a railway application, this paper introduces a novel mixture model for dealing with time series that are subject to changes in regime. The proposed approach, called ClustSeg, consists in modeling each cluster by a regression model in which the polynomial coefficients vary according to a discrete hidden process. In particular, this approach makes use of logistic functions to model the (smooth or abrupt) transitions between regimes. The model parameters are estimated by the maximum likelihood method solved by an EM algorithm. This approach can also be regarded as a clustering approach which operates by finding groups of time series having common changes in regime. In addition to providing a time series partition, it therefore provides a time series segmentation. The problem of selecting the optimal numbers of clusters and segments is solved by means of the Bayesian Information Criterion. The ClustSeg approach is shown to be efficient using a variety of simulated time series and real-world time series of electrical power consumption from rail switching operations.
ISSN: 1862-5347, 1862-5355
DOI: 10.1007/s11634-011-0096-5
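
Illustration: the summary describes each cluster as a polynomial regression whose coefficients switch between regimes, with logistic functions governing the smooth or abrupt transitions. The sketch below is not the authors' implementation and does no EM fitting. It only evaluates the mean curve of a single cluster for fixed, made-up parameters. The function names and the parameters `slopes`, `intercepts`, and `regime_coeffs` are hypothetical and chosen for illustration.

```python
import math


def regime_weights(t, slopes, intercepts):
    """Logistic (softmax) weight of each regime at time t.

    Each regime has a linear score slope * t + intercept; the weights are
    the normalized exponentials of these scores. (Assumed parametrization.)
    """
    scores = [a * t + b for a, b in zip(slopes, intercepts)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def poly(coeffs, t):
    """Evaluate a polynomial; coeffs are in increasing order of degree."""
    return sum(c * t ** p for p, c in enumerate(coeffs))


def cluster_mean(t, slopes, intercepts, regime_coeffs):
    """Mean curve of one cluster: regime polynomials mixed by logistic weights."""
    weights = regime_weights(t, slopes, intercepts)
    return sum(w * poly(c, t) for w, c in zip(weights, regime_coeffs))


# Hypothetical example: two constant regimes (levels 1.0 and 3.0)
# with a transition centered at t = 0.5.
slopes = [0.0, 20.0]
intercepts = [0.0, -10.0]
regime_coeffs = [[1.0], [3.0]]

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, round(cluster_mean(t, slopes, intercepts, regime_coeffs), 4))
```

In this parametrization the two regimes have equal weight at t = 0.5, where their scores coincide. Increasing the gap between the slopes while keeping the crossing point fixed makes the change of regime more abrupt, and decreasing it makes the change smoother.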