CLARO: modeling and processing uncertain data streams

Bibliographic Details
Published in The VLDB Journal Vol. 21; no. 5; pp. 651-676
Main Authors Tran, Thanh T. L.; Peng, Liping; Diao, Yanlei; McGregor, Andrew; Liu, Anna
Format Journal Article
Language English
Published Berlin/Heidelberg Springer-Verlag 01.10.2012
Springer
Summary: Uncertain data streams, where data are incomplete and imprecise, have been observed in many environments. Feeding such data streams to existing stream systems produces results of unknown quality, which is of paramount concern to monitoring applications. In this paper, we present the CLARO system that supports stream processing for uncertain data naturally captured using continuous random variables. CLARO employs a unique data model that is flexible and allows efficient computation. Built on this model, we develop evaluation techniques for relational operators by exploring statistical theory and approximation. We also consider query planning for complex queries given an accuracy requirement. Evaluation results show that our techniques can achieve high performance while satisfying accuracy requirements and outperform state-of-the-art sampling methods.
ISSN: 1066-8888, 0949-877X
DOI: 10.1007/s00778-011-0261-7