Efficient mining of intra-periodic frequent sequences
Published in | Array (New York) Vol. 16; p. 100263 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Inc; 01.12.2022; Elsevier |
Subjects | |
Summary: | Frequent Sequence Mining (FSM) is a fundamental task in data mining. Although FSM algorithms extract frequent patterns, they cannot discover patterns that appear periodically in the data. Yet periodic trends occur in many areas, such as market basket analysis, where discovering itemsets that customers purchase periodically can help explain periodic customer behavior. This is the task of Periodic Frequent Pattern Mining (PFPM). A major limitation common to traditional PFPM algorithms is that they consider periodicity only between non-disjoint itemsets and do not take into account the periods between disjoint itemsets. Thus, they find itemsets that appear periodically but fail to find periodic appearances of distinct itemsets. To address this limitation, this paper extends the traditional FSM problem with intra-periodicity and provides a theoretical background for extracting intra-periodic frequent sequences. This leads to a new mining algorithm called Intra-Periodic Frequent Sequence Miner. Experimental results confirm its efficiency.
• Intra-periodicity is integrated into the problem of mining frequent sequences.
• A theoretical background for the extraction of intra-periodic frequent sequences.
• New pruning and partitioning strategies of the search space.
• A new algorithm called Intra-Periodic Frequent Sequence Miner (IPFSM).
• The correctness of IPFSM is proven and experiments confirm its efficiency. |
---|---|
ISSN: | 2590-0056 |
DOI: | 10.1016/j.array.2022.100263 |
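
The limitation described in the summary can be illustrated with a minimal Python sketch of the conventional PFPM periodicity check for a single itemset. This is not the paper's IPFSM algorithm, and it does not implement intra-periodicity, which the record does not define. It assumes the commonly used definition in which the periods of an itemset are the gaps between consecutive transactions containing it, including the gaps to the start and end of the database. The function names, thresholds, and sample transactions are hypothetical.

```python
from typing import Iterable, List, Sequence, Set


def periods(transactions: Sequence[Set[str]], itemset: Iterable[str]) -> List[int]:
    """Gaps between consecutive 1-indexed transactions that contain `itemset`.

    Assumed convention: position 0 and the database length are added as
    boundaries, so the first and last gaps are counted as periods too.
    """
    target = set(itemset)
    positions = [i for i, t in enumerate(transactions, start=1) if target <= t]
    boundaries = [0] + positions + [len(transactions)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


def is_periodic_frequent(
    transactions: Sequence[Set[str]],
    itemset: Iterable[str],
    min_support: int,
    max_period: int,
) -> bool:
    """True if `itemset` occurs at least `min_support` times and no gap
    between its occurrences exceeds `max_period`."""
    target = set(itemset)
    support = sum(1 for t in transactions if target <= t)
    return support >= min_support and max(periods(transactions, target)) <= max_period


# Hypothetical market-basket data: one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"beer"},
    {"bread", "milk", "eggs"},
    {"beer"},
    {"bread", "milk"},
    {"beer"},
]

# Each call examines one itemset in isolation.
print(periods(transactions, {"bread", "milk"}))
print(is_periodic_frequent(transactions, {"bread", "milk"}, min_support=2, max_period=2))
```

Each call measures the gaps between occurrences of one itemset only. Nothing in this check relates the occurrences of `{"bread", "milk"}` to those of `{"beer"}`, which is the kind of regularity between distinct itemsets that the summary says traditional PFPM algorithms miss.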