A sliding window-based false-negative approach for ubiquitous data stream analysis
Published in | International journal of communication systems Vol. 25; no. 6; pp. 691 - 716 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Chichester, UK: John Wiley & Sons, Ltd, 01.06.2012 |
Subjects | |
Summary: | Ubiquitous data stream mining (UDSM) is the process of performing data analysis on mobile, embedded and ubiquitous devices. In many applications, a large volume of data can be mined for interesting and relevant information. Data stream mining requires computationally intensive mining techniques to be applied in mobile environments, where analysis is constrained to a single real‐time pass with limited computational resources. We therefore have to ensure that the result stays within the error tolerance range. In this paper, we propose a false‐negative approach based on the Chernoff bound for efficient analysis of the data stream. Specifically, we consider the problem of approximating frequency counts for space‐efficient computation over data stream sliding windows. We show that a false‐negative approach, which allows a controlled number of frequent itemsets to be missing from the output, is a more promising solution for mining frequent itemsets from a ubiquitous data stream. The proposed algorithms are simple to implement and have provable quality, space, and time guarantees. The experimental results show that the proposed algorithms achieve an accuracy of at least 99% and require a small execution time. Copyright © 2011 John Wiley & Sons, Ltd.
In this paper, we address the problem of approximate frequency counts for space‐efficient computation over data stream sliding windows. The proposed method is controlled by a predefined parameter so that a desired recall rate of frequent itemsets can be guaranteed. It also dynamically maintains potentially frequent itemsets, with high probability, in a potential table, and has provable quality, space and time guarantees. |
---|---|
Bibliography: | ArticleID:DAC1211 ark:/67375/WNG-LGQDRVNT-W istex:8268654217EEF31A36EB7948319239365067D217 |
ISSN: | 1074-5351 1099-1131 |
DOI: | 10.1002/dac.1211 |
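
The summary above describes Chernoff-bound pruning over a sliding window without giving details. The following is a minimal sketch of one common form of that idea, not the paper's algorithm: it assumes the error bound ε = sqrt(2·s·ln(2/δ)/n) for minimum support s, failure probability δ and n observed items, and it handles single items rather than itemsets. The names `chernoff_epsilon` and `WindowCounter` are hypothetical, and the sketch keeps exact counts for clarity, so it only illustrates the thresholds and not the space savings the paper claims.

```python
import math
from collections import Counter, deque


def chernoff_epsilon(min_support, delta, n):
    """Assumed error bound: sqrt(2 * s * ln(2 / delta) / n)."""
    return math.sqrt(2.0 * min_support * math.log(2.0 / delta) / n)


class WindowCounter:
    """Hypothetical helper: exact item counts over the last `window_size` items."""

    def __init__(self, window_size, min_support, delta):
        self.window_size = window_size
        self.min_support = min_support
        self.delta = delta
        self.window = deque()
        self.counts = Counter()

    def add(self, item):
        # Append the new item and evict the oldest one once the window is full.
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequent_items(self):
        # Items whose frequency in the window reaches the support threshold.
        n = len(self.window)
        if n == 0:
            return set()
        return {x for x, c in self.counts.items() if c / n >= self.min_support}

    def potential_items(self):
        # Items not yet prunable: frequency at least (s - epsilon).
        n = len(self.window)
        if n == 0:
            return set()
        eps = chernoff_epsilon(self.min_support, self.delta, n)
        return {x for x, c in self.counts.items()
                if c / n >= self.min_support - eps}
```

Under this assumed bound, ε shrinks as n grows, so a short window prunes almost nothing and a long window prunes aggressively. Lowering δ widens ε, which keeps more candidates and trades memory for recall.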