On the design of hardware-software architectures for frequent itemsets mining on data streams

Bibliographic Details
Published in: Journal of Intelligent Information Systems, Vol. 50, no. 3, pp. 415-440
Main Authors: Bustio-Martínez, Lázaro; Cumplido, René; Hernández-León, Raudel; Bande-Serrano, José M.; Feregrino-Uribe, Claudia
Format: Journal Article
Language: English
Published: New York, Springer US, 01.06.2018
Springer Nature B.V.
Summary: Frequent Itemsets Mining has been applied in many data processing applications with remarkable results. Recently, data stream processing has been gaining a lot of attention due to its practical applications. Data in data streams are transmitted at high rates and cannot be stored for offline processing, which makes it impractical to apply traditional data mining approaches (such as Frequent Itemsets Mining) directly to data streams. In this paper, two single-pass parallel algorithms based on a tree data structure for Frequent Itemsets Mining on data streams are proposed. The presented algorithms employ the Landmark and Sliding Window Models for window handling. In this paper, as in other reviewed papers, if the number of frequent items in the data stream is low, the proposed algorithms perform an exact mining process. Conversely, if the number of frequent patterns is large, the mining process is approximate, with no false positives produced. The experiments conducted demonstrate that the presented algorithms achieve shorter processing times than the hardware architectures reported in the state of the art.
ISSN: 0925-9902, 1573-7675
DOI:10.1007/s10844-017-0461-8
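
The two window models named in the summary can be illustrated with a small software sketch. This is not the paper's method: the summary describes tree-based parallel algorithms for hardware, while the code below simply enumerates small itemsets and counts them exactly in a single pass. The class names, the `max_size` cap, and the `min_count` threshold are all assumptions made for illustration. A landmark window counts everything seen since a fixed starting point; a sliding window keeps only the most recent transactions and removes the contribution of each one as it expires.

```python
from collections import Counter, deque
from itertools import combinations


def itemsets(transaction, max_size=2):
    """Yield every non-empty itemset of up to max_size items as a sorted tuple."""
    items = sorted(set(transaction))
    for k in range(1, max_size + 1):
        yield from combinations(items, k)


class LandmarkCounter:
    """Hypothetical helper: counts itemsets over all transactions since the landmark."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.counts = Counter()

    def add(self, transaction):
        # Single pass: each transaction is processed once and never stored.
        self.counts.update(itemsets(transaction, self.max_size))

    def frequent(self, min_count):
        return {s: c for s, c in self.counts.items() if c >= min_count}


class SlidingWindowCounter:
    """Hypothetical helper: counts itemsets over the last window_size transactions."""

    def __init__(self, window_size, max_size=2):
        self.window_size = window_size
        self.max_size = max_size
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        self.window.append(tuple(transaction))
        self.counts.update(itemsets(transaction, self.max_size))
        if len(self.window) > self.window_size:
            # Remove the contribution of the transaction that left the window.
            expired = self.window.popleft()
            self.counts.subtract(itemsets(expired, self.max_size))

    def frequent(self, min_count):
        return {s: c for s, c in self.counts.items() if c >= min_count}


stream = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
landmark = LandmarkCounter()
sliding = SlidingWindowCounter(window_size=2)
for t in stream:
    landmark.add(t)
    sliding.add(t)

print(landmark.frequent(min_count=2))
print(sliding.frequent(min_count=2))
```

Exhaustive enumeration like this grows quickly with transaction length, which is one reason a streaming design might fall back to approximate counting when the number of frequent patterns is large, as the summary notes.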