Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework


Bibliographic Details
Published in Information Sciences Vol. 553; pp. 31-48
Main Authors Wu, Jimmy Ming-Tai, Srivastava, Gautam, Wei, Min, Yun, Unil, Lin, Jerry Chun-Wei
Format Journal Article
Language English
Published Elsevier Inc 01.04.2021
Summary:
•An efficient algorithm, EFUPM, is proposed to discover fuzzy high-utility patterns.
•A Hadoop-based algorithm, HFUPM, is proposed to handle large-scale databases.
•Two upper bounds are designed to remove unpromising candidates early.
•Experiments show that the proposed algorithms achieve better performance.

Over the past decade, high-utility itemset mining (HUIM) has received widespread attention because it can surface more critical information than frequent itemset mining (FIM). Unfortunately, HUIM remains very similar to FIM, since it selects itemsets using a binary model based on a predefined minimum utility threshold. In addition, most previous work on HUIM focused only on single, small datasets, which is unrealistic for today's real-world big data environments. In this work, fuzzy-set theory and a MapReduce framework are combined to design a novel high fuzzy utility pattern mining algorithm that resolves these issues. Fuzzy-set theory is applied first, and a new algorithm called efficient high fuzzy utility itemset mining (EFUPM) is designed to discover high fuzzy utility patterns on a single machine. Two upper bounds are then estimated to allow early pruning of unpromising candidates in the search space. To handle large-scale datasets, a Hadoop-based high fuzzy utility pattern mining (HFUPM) algorithm is then developed to discover high fuzzy utility patterns on the Hadoop framework. Experimental results show that, compared with state-of-the-art approaches, the proposed algorithms perform well in mining the required high fuzzy utility patterns both on a single machine and in a large-scale environment.
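
For readers unfamiliar with the binary threshold model the summary refers to, the following is a minimal sketch of classical (non-fuzzy) HUIM, not the EFUPM or HFUPM algorithms themselves. It uses the standard definition in which an itemset's utility is the sum of quantity times unit profit over the transactions that contain it. The toy transactions, the profit table, and the function names are hypothetical, and the brute-force enumeration stands in for the upper-bound pruning that real miners use.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"b": 2, "c": 1},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration; an itemset is kept only if utility >= min_utility."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:  # the binary in/out decision
                result[itemset] = utility
    return result


print(high_utility_itemsets(transactions, unit_profit, min_utility=10))
```

With a threshold of 10, only the itemsets {a} and {a, b} pass; every other itemset is discarded outright no matter how close it comes to the threshold, which is the all-or-nothing behavior the summary describes.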
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2020.12.004