SuperSketch: A Multi-Dimensional Reversible Data Structure for Super Host Identification

Bibliographic Details
Published in: IEEE Transactions on Dependable and Secure Computing, Vol. 19, no. 4, pp. 2741-2754
Main Authors: Jing, Xuyang; Han, Hui; Yan, Zheng; Pedrycz, Witold
Format: Journal Article
Language: English
Published: Washington, IEEE, 01.07.2022
IEEE Computer Society
Summary: Facing big network traffic data, effective data compression becomes crucially important and urgently needed for estimating host cardinalities and identifying super hosts. However, the current literature confronts several challenges: incapability of simultaneously measuring various types of host cardinalities and inability to efficiently reconstruct super host addresses. To address these challenges, in this article, we propose a novel sketch data structure, named SuperSketch, to simultaneously measure multiple types of host cardinalities with the purpose of efficiently identifying super hosts. SuperSketch has two significant characteristics: multi-dimensionality and reversibility. The multi-dimensionality makes SuperSketch capable of simultaneously measuring Source Cardinality, Destination Cardinality, and Destination Port Cardinality. The reversibility allows SuperSketch to accurately and quickly reconstruct the original addresses of super hosts once they are identified. We conduct both theoretical analysis and performance evaluation based on real-world network traffic. Experimental results show that SuperSketch achieves outstanding performance for multi-cardinality measurement, super host identification, and host address reconstruction.
ISSN: 1545-5971, 1941-0018
DOI: 10.1109/TDSC.2021.3072295
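
Illustrative note: the summary names three host cardinalities but this record does not define them. The sketch below is an exact, set-based baseline, not SuperSketch itself, and may help make the quantities concrete. It assumes (these definitions are assumptions, not taken from the article) that a host's source cardinality is the number of distinct sources that contact it, its destination cardinality is the number of distinct destinations it contacts, and its destination port cardinality is the number of distinct destination ports it contacts. The function names, the packet tuple layout, and the threshold rule are hypothetical.

```python
from collections import defaultdict


def exact_cardinalities(packets):
    """Exact per-host cardinalities from (src_ip, dst_ip, dst_port) tuples.

    Assumed definitions (not taken from the article):
      source cardinality           = distinct sources contacting the host
      destination cardinality      = distinct destinations the host contacts
      destination port cardinality = distinct destination ports the host contacts
    """
    sources_of = defaultdict(set)
    dests_of = defaultdict(set)
    ports_of = defaultdict(set)
    for src, dst, dport in packets:
        sources_of[dst].add(src)
        dests_of[src].add(dst)
        ports_of[src].add(dport)

    hosts = set(sources_of) | set(dests_of)
    return {
        h: {
            "source": len(sources_of.get(h, ())),
            "destination": len(dests_of.get(h, ())),
            "destination_port": len(ports_of.get(h, ())),
        }
        for h in hosts
    }


def super_hosts(cards, kind, threshold):
    """Hosts whose cardinality of the given kind is at least the threshold."""
    return sorted(h for h, c in cards.items() if c[kind] >= threshold)


packets = [
    ("10.0.0.1", "10.0.0.2", 80),
    ("10.0.0.1", "10.0.0.3", 443),
    ("10.0.0.1", "10.0.0.3", 22),
    ("10.0.0.4", "10.0.0.2", 80),
]
cards = exact_cardinalities(packets)
```

This baseline stores every distinct address and port per host, so its memory grows with the traffic. A sketch such as the one the summary describes would replace these sets with a compact structure that estimates the counts in limited memory and can still recover the addresses of super hosts.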