Space-efficient Relative Error Order Sketch over Data Streams

We consider the problem of continuously maintaining order sketches over data streams with a relative rank error guarantee ∊. Novel space-efficient and one-scan randomised techniques are developed. Our first randomised algorithm can guarantee such a relative error precision ∊ with confidence 1 - \del...

Full description

Saved in:
Bibliographic Details
Published in22nd International Conference on Data Engineering (ICDE'06) p. 51
Main Authors Ying Zhang, Xuemin Lin, Jian Xu, Korn, F., Wei Wang
Format Conference Proceeding
LanguageEnglish
Published IEEE 2006
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:We consider the problem of continuously maintaining order sketches over data streams with a relative rank error guarantee ∊. Novel space-efficient and one-scan randomised techniques are developed. Our first randomised algorithm can guarantee such a relative error precision ∊ with confidence 1 - \delta using O( 1\_ \in \frac{1} {2}2 log 1d log ∊^2N) space, where N is the number of data elements seen so far in a data stream. Then, a new one-scan space compression technique is developed. Combined with the first randomised algorithm, the one-scan space compression technique yields another one-scan randomised algorithm that guarantees the space requirement is O( 1\frac{1} { \in } log(1\frac{1}{ \in } log 1\begin{gathered} \frac{1}{\delta } \hfill \\ \hfill \\ \end{gathered} )\frac{{\log ^{2 + \alpha } \in N}} {{1 - 1/2^\alpha }} (for\alpha \gt 0) on average while the worst case space remains O( \frac{1}{{ \in ^2 }}\log \frac{1} {\delta }\log \in ^2 N). These results are immediately applicable to approximately computing quantiles over data streams with a relative error guarantee \in and significantly improve the previous best space bound O( \frac{1} {{ \in ^3 }}\log \frac{1}{\delta }\log N). Our extensive experiment results demonstrate that both techniques can support an on-line computation against high speed data streams.
ISBN:0769525709
9780769525709
ISSN:1063-6382
2375-026X
DOI:10.1109/ICDE.2006.145