Weighted Contiguous Sequential Pattern Mining

Bibliographic Details
Published in: 2022 4th International Conference on Data Intelligence and Security (ICDIS), pp. 358-365
Main Authors: Zhou, Tingfu; Gan, Wensheng; Qi, Zhenlian
Format: Conference Proceeding
Language: English
Published: IEEE, 01.08.2022
Summary: In real life, much big data is contiguous, such as traffic flow and network flow, and several contiguous mining algorithms have therefore been developed. The significance of data items (e.g., in DNA sequences) often differs, so real data may carry various weights. However, existing weighted mining algorithms do not fully consider the continuity of the mined data. In this study, we are the first to formulate the problem of mining weighted contiguous sequential patterns, and we propose a new algorithm named WCSpan. Using a modified prefix pattern expansion and a tight weighted upper-bound model, we prove that WCSpan can efficiently mine weighted contiguous sequential patterns. Experimental results show that, compared with existing similar algorithms, the proposed algorithm has advantages in execution time and memory usage. The integrity of the patterns output by WCSpan is preserved, and data omission is avoided. In addition, WCSpan generates patterns faster than other methods, and its weighted upper-bound model prunes redundant candidates precisely to save memory. Both properties significantly improve the performance of WCSpan.
DOI: 10.1109/ICDIS55630.2022.00061
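
The summary mentions prefix pattern expansion and upper-bound pruning but gives no definitions, so the following is only a minimal illustrative sketch of weighted contiguous pattern mining in general. It is not WCSpan. Everything in it is an assumption made for illustration: the function name `mine_weighted_contiguous`, the definition of weighted support (mean item weight times the fraction of sequences containing the pattern as a contiguous run), and the deliberately loose upper bound (maximum item weight times support) used in place of the paper's tight model.

```python
from collections import defaultdict


def mine_weighted_contiguous(db, weights, min_wsup):
    """Return {pattern: weighted support} for contiguous patterns.

    Assumed definitions (hypothetical, not taken from the paper):
      weight(p)  = mean of the item weights in p
      support(p) = fraction of sequences containing p as a contiguous run
      wsup(p)    = weight(p) * support(p)
    """
    n = len(db)
    max_w = max((weights[item] for seq in db for item in seq), default=0.0)
    results = {}

    # Occurrences of each length-1 pattern as (sequence id, end position).
    frontier = defaultdict(list)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            frontier[(item,)].append((sid, pos))

    while frontier:
        next_frontier = defaultdict(list)
        for pattern, positions in frontier.items():
            support = len({sid for sid, _ in positions}) / n
            # Loose upper bound: support cannot grow when a pattern is
            # extended, and no pattern weight exceeds max_w, so neither this
            # pattern nor any extension can reach min_wsup.
            if max_w * support < min_wsup:
                continue
            wsup = sum(weights[i] for i in pattern) / len(pattern) * support
            if wsup >= min_wsup:
                results[pattern] = wsup
            # Contiguity: extend only with the item right after an occurrence.
            for sid, end in positions:
                if end + 1 < len(db[sid]):
                    extended = pattern + (db[sid][end + 1],)
                    next_frontier[extended].append((sid, end + 1))
        frontier = next_frontier
    return results


# Toy DNA-like database with made-up item weights.
toy_db = ["ACGT", "ACGA", "TACG"]
toy_weights = {"A": 0.9, "C": 0.4, "G": 0.7, "T": 0.3}
patterns = mine_weighted_contiguous(toy_db, toy_weights, min_wsup=0.5)
for pattern in sorted(patterns, key=len):
    print("".join(pattern), round(patterns[pattern], 3))
```

Note that the single item `C` falls below the threshold in this toy run while its extension `AC` does not. Weighted support is not anti-monotone under this definition, which is why pruning has to rely on an upper bound rather than on the weighted support itself.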