A Data-Deduplication-Based Matching Mechanism for URL Filtering

Bibliographic Details
Published in: 2018 IEEE International Conference on Communications (ICC), pp. 1-6
Main Authors: Lu, Yuhai; Liu, Yanbing; Zhang, Chunyan; Tan, Jianlong
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2018
Summary: URL filtering plays an important role in various network security applications. It usually requires high matching performance, but the performance of classical multiple string matching algorithms has proven difficult to improve significantly. In this article, we observe that the online URLs to be filtered contain a large number of duplicates. Based on this observation, we propose a novel deduplication-based matching mechanism (DBM) for URL filtering. The DBM caches information about duplicate URLs in a hash table so that the URL filtering system does not scan the same URL repeatedly. The DBM can be used in conjunction with any multiple string matching algorithm. Experimental results show that when a multiple string matching algorithm is used in conjunction with the DBM, the matching speed of the URL filtering system increases by 9%-68%. The DBM can therefore significantly accelerate a URL filtering system. Moreover, because the DBM is independent of the specific matching algorithm, it can easily be applied in other fields.
ISSN: 1938-1883
DOI: 10.1109/ICC.2018.8422284
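
The summary describes caching results for duplicate URLs in a hash table so that the underlying matcher is not invoked again for a URL it has already scanned. The following is a minimal Python sketch of that general idea only, not the authors' implementation: the names `DedupFilter` and `naive_matcher` are hypothetical, the substring matcher is a stand-in for a real multiple string matching algorithm, and the cache is unbounded because the summary does not say how the DBM limits or evicts entries.

```python
from typing import Callable, Dict, FrozenSet, Iterable

# Hypothetical matcher type: takes a URL and returns the set of matched patterns.
Matcher = Callable[[str], FrozenSet[str]]


def naive_matcher(patterns: Iterable[str]) -> Matcher:
    """Stand-in for a real multiple string matching algorithm (assumption)."""
    pats = tuple(patterns)

    def match(url: str) -> FrozenSet[str]:
        # Plain substring search over every pattern; slow, but easy to verify.
        return frozenset(p for p in pats if p in url)

    return match


class DedupFilter:
    """Wraps any matcher with a hash-table cache keyed by the full URL."""

    def __init__(self, matcher: Matcher) -> None:
        self._matcher = matcher
        # Unbounded dict for illustration; a real system would cap its size.
        self._cache: Dict[str, FrozenSet[str]] = {}
        self.scans = 0  # number of times the underlying matcher actually ran

    def match(self, url: str) -> FrozenSet[str]:
        cached = self._cache.get(url)
        if cached is not None:
            # Duplicate URL: reuse the stored result and skip the scan.
            return cached
        self.scans += 1
        result = self._matcher(url)
        self._cache[url] = result
        return result


if __name__ == "__main__":
    dbm = DedupFilter(naive_matcher(["ads.", "/track"]))
    stream = [
        "http://ads.example.com/a",
        "http://example.org/",
        "http://ads.example.com/a",  # duplicate of the first URL
    ]
    for url in stream:
        print(url, sorted(dbm.match(url)))
    print("matcher invocations:", dbm.scans)
```

Because the wrapper only depends on the `Matcher` call signature, any matching algorithm can be swapped in without changing the cache logic, which mirrors the algorithm independence claimed in the summary. How much speedup such a cache gives depends on the share of duplicate URLs in the stream and on the cost of hashing relative to scanning.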