A sliding window-based false-negative approach for ubiquitous data stream analysis

Bibliographic Details
Published in International Journal of Communication Systems, Vol. 25, no. 6, pp. 691-716
Main Authors Kim, Younghee, Park, Doo-soon, Kim, Heewan, Kim, Ungmo
Format Journal Article
Language English
Published Chichester, UK: John Wiley & Sons, Ltd, 01.06.2012
Summary: Ubiquitous data stream mining (UDSM) is the process of performing data analysis on mobile, embedded and ubiquitous devices. In many cases, a large volume of data can be mined for interesting and relevant information in a wide variety of applications. Data stream mining requires computationally intensive mining techniques to be applied in mobile environments, where analysis is constrained to a single real-time pass with limited computational resources. Therefore, we have to ensure that the result is within the error tolerance range. In this paper, we suggest a false-negative approach based on the Chernoff bound for efficient analysis of the data stream. Specifically, we consider the problem of approximating frequency counts for space-efficient computation over data stream sliding windows. We show that a false-negative approach, which allows a controlled number of frequent itemsets to be missing from the output, is a more promising solution for mining frequent itemsets from a ubiquitous data stream. The proposed algorithms are simple to implement and have provable quality, space, and time guarantees. The experimental results show that the proposed algorithms achieve an accuracy of at least 99% and require a small execution time. Copyright © 2011 John Wiley & Sons, Ltd. In this paper, we address the problem of approximating frequency counts for space-efficient computation over data stream sliding windows. The proposed method is controlled by a predefined parameter so that a desired recall rate of frequent itemsets can be guaranteed. The method also dynamically maintains potentially frequent itemsets, with high probability, in a potential table, and has provable quality, space and time guarantees.
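The summary does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the Chernoff-style error bound commonly used in false-negative stream mining, eps = sqrt(2 * s * ln(2 / delta) / n), where s is the minimum support, delta the allowed failure probability, and n the number of transactions in the window. The names `chernoff_epsilon` and `SlidingWindowCounter` are hypothetical, the sketch counts single items rather than itemsets, and it keeps exact counts, whereas a space-efficient algorithm would prune the table.

```python
import math
from collections import Counter, deque


def chernoff_epsilon(min_support, delta, n):
    """Assumed Chernoff-style bound: with probability >= 1 - delta, the
    observed support after n transactions is within eps of the true support."""
    return math.sqrt(2.0 * min_support * math.log(2.0 / delta) / n)


class SlidingWindowCounter:
    """Hypothetical illustration: item counts over the last window_size transactions."""

    def __init__(self, window_size, min_support, delta):
        self.window_size = window_size
        self.min_support = min_support
        self.delta = delta
        self.window = deque()     # transactions currently in the window
        self.counts = Counter()   # item -> number of window transactions containing it

    def add(self, transaction):
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        # Evict the oldest transaction once the window is full.
        if len(self.window) > self.window_size:
            for item in self.window.popleft():
                self.counts[item] -= 1
                if self.counts[item] == 0:
                    del self.counts[item]

    def frequent(self):
        """Items whose observed support meets the minimum support."""
        n = len(self.window)
        return {i: c / n for i, c in self.counts.items() if c / n >= self.min_support}

    def potentially_frequent(self):
        """Items kept as candidates under the relaxed threshold min_support - eps."""
        n = len(self.window)
        if n == 0:
            return {}
        eps = chernoff_epsilon(self.min_support, self.delta, n)
        threshold = max(self.min_support - eps, 0.0)
        return {i: c / n for i, c in self.counts.items() if c / n >= threshold}
```

In this sketch, a smaller delta widens eps and so retains more candidates, which corresponds loosely to the predefined parameter that the summary says controls the recall rate.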
Bibliography: ArticleID:DAC1211
ark:/67375/WNG-LGQDRVNT-W
istex:8268654217EEF31A36EB7948319239365067D217
ISSN: 1074-5351, 1099-1131
DOI: 10.1002/dac.1211