Hot Topic Clustering based on Gaussian Mixture Model built-in DTW
Published in | 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning (PRML) pp. 437 - 443 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 04.08.2023 |
Subjects | |
Summary: | This paper proposes a hot topic clustering method based on a Gaussian Mixture Model (GMM) with built-in dynamic time warping (DTW). Hot topic clustering is a typical application of time series clustering, and its accuracy remains a challenge. Compared with Euclidean distance, DTW distance prioritizes feature alignment in the time dimension by warping the samples onto a common time axis, so the clustering model focuses on the feature differences between sequences and ignores when those features occur. Compared with K-Shape, a mainstream time series clustering method, DTW aligns sequences more specifically, which greatly improves the accuracy of the algorithm. Because the traditional DTW algorithm suffers from the curse of dimensionality when aligning multiple vectors, the paper also introduces a novel DTW multi-vector consistency algorithm that builds on existing research. Fusing this algorithm into the GMM improves the accuracy of clustering hot keyword trends from Google Trends by 19.9% over the traditional GMM, while maintaining the advantage of GMM over K-Means. The improved GMM is also 29.9% more accurate than K-Shape, which indicates that it adapts well to time series datasets and can be used for hot topic clustering. |
---|---|
DOI: | 10.1109/PRML59573.2023.10348277 |
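
The contrast the summary draws between Euclidean and DTW distance can be illustrated with a minimal sketch. This is the textbook dynamic-programming form of DTW for two univariate series, not the multi-vector consistency algorithm from the paper; the function names and the two sample series `early` and `late` are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (b stays)
                cost[i][j - 1],      # b[j-1] matched again (a stays)
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


# Two hypothetical interest curves with the same spike, shifted in time.
early = [0, 0, 1, 5, 1, 0, 0, 0]
late = [0, 0, 0, 0, 1, 5, 1, 0]

print(euclidean(early, late))     # large: the spikes do not overlap in time
print(dtw_distance(early, late))  # zero: warping lines the spikes up
```

In this toy case the Euclidean distance treats the two curves as very different, while DTW treats them as the same shape, which is the behavior the summary relies on when it says the model can ignore when a feature occurs.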