Support estimation in frequent itemsets mining on Enriched Two Level Tree

Bibliographic Details
Published in Information systems (Oxford) Vol. 133; p. 102559
Main Authors Tayou Djamegni, Clémentin; Ndemaze, William Kery Branston; Kenmogne, Edith Belise; Nana Kouassi, Hervé Maradona; Nzegha Fountsop, Arnauld; Tetakouchom, Idriss; Tabueu Fotso, Laurent Cabrel
Format Journal Article
Language English
Published Elsevier Ltd 01.08.2025
Summary: Efficiently counting the support of candidate itemsets is a crucial aspect of extracting frequent itemsets because it directly impacts the overall performance of the mining process. Researchers have developed various techniques and data structures to overcome this challenge, but the problem is still open. In this paper, we investigate the two-level tree enrichment technique as a potential solution that does not add significant computational overhead. In addition, we introduce ETL_Miner, a novel algorithm that provides an estimated bound for the support value of all candidate itemsets within the search space. The method presented in this article is flexible and can be used with various algorithms. To demonstrate this point, we introduce a modified version of Apriori that integrates ETL_Miner as an extra pruning phase. Preliminary experimental results on both real and synthetic datasets confirm the accuracy of the proposed method and show that it reduces the total extraction time.
• Design of a new enriched data structure called ETL_Tree (Enriched Two Level Tree).
• Proposal of the ETL_Miner algorithm, based on ETL_Tree, to estimate itemset support.
• Relevance of ETL_Miner is verified by adding it as an extra pruning phase in Apriori.
• We performed experiments on real and synthetic datasets to evaluate method accuracy.
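The summary describes using an estimated bound on support as an extra pruning phase before exact counting. The record does not specify how ETL_Tree or ETL_Miner compute that bound, so the sketch below is not the authors' method. It only illustrates the general idea with an assumed, simple enrichment: for each item pair, keep the lengths of the transactions that contain it, which gives a valid upper bound on the support of any larger itemset. All function names here (`build_pair_index`, `support_upper_bound`, `prune_then_count`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_index(transactions):
    """Map each item pair to the lengths of the transactions containing it.

    Assumption: this per-pair "enrichment" is illustrative only and is
    not the ETL_Tree structure from the article.
    """
    index = defaultdict(list)
    for t in transactions:
        items = sorted(set(t))
        for pair in combinations(items, 2):
            index[pair].append(len(items))
    return index


def support_upper_bound(candidate, index):
    """Upper bound on the support of a candidate with at least two items.

    A transaction containing the candidate must contain every pair of
    its items and must have at least len(candidate) distinct items.
    """
    k = len(candidate)
    return min(
        sum(1 for length in index.get(pair, ()) if length >= k)
        for pair in combinations(sorted(candidate), 2)
    )


def exact_support(candidate, transactions):
    """Count the transactions that contain every item of the candidate."""
    wanted = set(candidate)
    return sum(1 for t in transactions if wanted <= set(t))


def prune_then_count(candidates, transactions, min_support):
    """Skip exact counting for candidates whose bound is below min_support."""
    index = build_pair_index(transactions)
    frequent = {}
    for cand in candidates:
        if support_upper_bound(cand, index) < min_support:
            continue  # bound shows the candidate cannot be frequent
        support = exact_support(cand, transactions)
        if support >= min_support:
            frequent[frozenset(cand)] = support
    return frequent
```

With a minimum support of 3 and the transactions `abc`, `ab`, `ac`, `bc`, `abcd`, every 2-subset of `{a, b, c}` is frequent, so the classic Apriori subset check alone would keep the candidate. The length-aware bound is 2, so the candidate is discarded without scanning the data again.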
ISSN:0306-4379
DOI:10.1016/j.is.2025.102559