Straggler Identification in Round-Trip Data Streams via Newton's Identities and Invertible Bloom Filters

Bibliographic Details
Published in	IEEE Transactions on Knowledge and Data Engineering, Vol. 23, no. 2, pp. 297-306
Main Authors	Eppstein, D.; Goodrich, M. T.
Format	Journal Article
Language	English
Published	New York, NY: IEEE, 01.02.2011; IEEE Computer Society; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithm design and analysis | Algorithms | Applied sciences | Bloom filters | Complexity theory | Computer science; control theory; systems | Computer systems and distributed systems. User interface | Data streams | Deletion | Exact sciences and technology | Finite element methods | Handles | Insertion | Lower bounds | Multicast | Newton method | Newton's identities | Polynomials | Servers | Software | Straggler identification | Streams | Studies | Streaming | Lower bound | Probabilistic approach | Bloom filter | Distributed system | Randomization | Identifier | Multiplicity | Bandwidth | Multiset | Deterministic approach
Summary:	In this paper, we study the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in o(n) space, where n is the total number of identities. Straggler identification has applications, for example, in determining the unacknowledged packets in a high-bandwidth multicast data stream. We provide a deterministic solution to the straggler identification problem that uses only O(d log n) bits, based on a novel application of Newton's identities for symmetric polynomials. This solution can identify any subset of d stragglers from a set of n O(log n)-bit identifiers, assuming that there are no false deletions of identities not already in the set. Indeed, we give a lower bound argument that shows that any small-space deterministic solution to the straggler identification problem cannot be guaranteed to handle false deletions. Nevertheless, we provide a simple randomized solution, using O(d log n log(1/ε)) bits, that can maintain a multiset and solve the straggler identification problem, tolerating false deletions, where ε > 0 is a user-defined parameter bounding the probability of an incorrect response. This randomized solution is based on a new type of Bloom filter, which we call the invertible Bloom filter.
ISSN:	1041-4347, 1558-2191
DOI:	10.1109/TKDE.2010.132
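
Illustration:	The deterministic approach named in the summary can be sketched in a few lines. The sketch below is not the authors' implementation; it is a minimal illustration under stated assumptions: identifiers are integers in [0, PRIME), arithmetic is done modulo a fixed prime larger than both n and d, deletions are never false, and the final root search simply tests every candidate in a caller-supplied universe (the record does not say how the paper performs that step). The names StragglerSketch, PRIME, and stragglers are hypothetical. It requires Python 3.8 or later for pow(k, -1, PRIME).

```python
# Hypothetical sketch; names and the brute-force root search are assumptions,
# not details taken from the paper.

PRIME = 2_147_483_647  # 2^31 - 1, assumed larger than every identifier and than d


class StragglerSketch:
    """Stores a count and d power sums modulo PRIME, i.e. O(d log n) bits."""

    def __init__(self, d):
        self.d = d
        self.count = 0
        self.power_sums = [0] * d  # power_sums[k-1] = sum of x^k mod PRIME

    def _update(self, x, sign):
        self.count += sign
        term = 1
        for k in range(self.d):
            term = term * x % PRIME
            self.power_sums[k] = (self.power_sums[k] + sign * term) % PRIME

    def insert(self, x):
        self._update(x, +1)

    def delete(self, x):
        # Assumes x is currently in the set (no false deletions).
        self._update(x, -1)

    def stragglers(self, universe):
        m = self.count
        if not 0 <= m <= self.d:
            raise ValueError("the sketch only supports between 0 and d stragglers")
        # Newton's identities: k * e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i
        e = [1]
        for k in range(1, m + 1):
            acc = 0
            for i in range(1, k + 1):
                sign = 1 if i % 2 == 1 else -1
                acc += sign * e[k - i] * self.power_sums[i - 1]
            e.append(acc * pow(k, -1, PRIME) % PRIME)
        # The stragglers are the roots of sum_k (-1)^k * e_k * x^(m-k).
        coeffs = [(-1) ** k * e[k] % PRIME for k in range(m + 1)]

        def evaluate(x):
            value = 0
            for c in coeffs:  # Horner's rule
                value = (value * x + c) % PRIME
            return value

        return [x for x in universe if evaluate(x) == 0]


sketch = StragglerSketch(d=4)
for x in range(1, 1001):
    sketch.insert(x)
for x in range(1, 1001):
    if x not in (17, 256, 999):
        sketch.delete(x)
print(sketch.stragglers(range(1, 1001)))  # [17, 256, 999]
```

The sketch keeps only d + 1 small integers regardless of how many insertions and deletions occur. A single false deletion would corrupt the power sums without detection, which is consistent with the limitation the summary states for deterministic solutions.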