Gambling Domain Name Recognition via Certificate and Textual Analysis


Bibliographic Details
Published in Computer journal Vol. 66; no. 8; pp. 1829-1839
Main Authors Sun, GuoYing; Ye, Feng; Chai, Tingting; Zhang, Zhaoxin; Tong, Xiaojun; Prasad, Shitala
Format Journal Article
Language English
Published Oxford University Press 14.08.2023

Summary: Online gambling is a key illegal behaviour targeted by public security departments in most countries because of its potential threat to cyberspace security and social stability. Research on gambling domain name (GDN) classification is therefore important and in great demand in both academia and industry, yet very little work has addressed the topic so far. Most GDN training datasets in previous work were drawn from GDN blacklists provided by publicly available data sources; the authors did not verify the authenticity and accuracy of these datasets, and the classification results were not particularly satisfactory. This paper proposes CT-GDNC, a classification method based on certificate and textual analysis, which yields a GDN training dataset with an accuracy of 0.9776 and significantly improves GDN classification results. Exhaustive comparative experiments on 10K GDNs obtained via a fine-tuned BERT model and 10K benign domain names collected from the Alexa Top 1 Million list show that the proposed method sets a new baseline for GDN classification, with accuracy 0.9936, precision 0.9936, recall 0.9939 and F1 0.9936.
ISSN: 0010-4620, 1460-2067
DOI: 10.1093/comjnl/bxac043
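
The summary reports accuracy, precision, recall and F1 for a binary GDN-versus-benign classifier. As a reminder of how these four figures relate, here is a minimal Python sketch using the standard definitions. The function name `classification_metrics` and the confusion-matrix counts are hypothetical illustrations; they are not taken from the paper, and the paper's own evaluation code or averaging scheme may differ.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp: gambling domains correctly flagged
    fp: benign domains wrongly flagged as gambling
    tn: benign domains correctly passed
    fn: gambling domains missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With a balanced test set such as the 10K/10K split described in the summary, accuracy is a reasonable headline figure; on imbalanced data, precision and recall are usually more informative.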