Phishing website identification based on double weight random forest

Bibliographic Details
Published in: 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), pp. 263-266
Main Authors: Zhou, Zhixin; Zhang, Chenghaoyue
Format: Conference Proceeding
Language: English
Published: IEEE, 20.05.2022
Summary: To address the insufficient detection accuracy and high misjudgment rate caused by large amounts of redundant data, a random forest algorithm that combines feature weight selection with decision tree weighting is proposed for building a phishing website detection model. A clustering algorithm groups the feature data into clusters, and features inside and at the edge of each cluster are selected to train the decision trees. A test data set is then fed to the trees to calculate the weight of each decision tree, which is determined in combination with an improved Bayesian formula. The feature weights and decision tree weights together form a double-weight random forest algorithm, which can improve the accuracy of phishing website detection.
DOI: 10.1109/CVIDLICCEA56201.2022.9824544
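
The tree-weighting step described in the summary can be illustrated with a minimal sketch. The abstract gives neither the improved Bayesian formula nor the details of the clustering-based feature selection, so this sketch substitutes plain held-out accuracy as the tree weight and omits feature selection entirely. The function names, the toy features, and the hand-written stumps standing in for trained decision trees are all hypothetical.

```python
from collections import defaultdict


def tree_weights(trees, X_val, y_val):
    """Weight each tree by its accuracy on held-out data.

    Assumption: accuracy stands in for the paper's improved Bayesian
    formula, which the abstract does not specify.
    """
    weights = []
    for tree in trees:
        correct = sum(1 for x, y in zip(X_val, y_val) if tree(x) == y)
        weights.append(correct / len(y_val))
    return weights


def weighted_vote(trees, weights, x):
    """Return the label whose supporting trees have the largest total weight."""
    totals = defaultdict(float)
    for tree, w in zip(trees, weights):
        totals[tree(x)] += w
    return max(totals, key=totals.get)


# Hypothetical feature vector: (url_length, has_ip_address, num_dots).
# Label 1 = phishing, 0 = legitimate. Each lambda is a toy stump.
trees = [
    lambda x: 1 if x[0] > 50 else 0,   # long URLs look suspicious
    lambda x: 1 if x[1] == 1 else 0,   # raw IP address in the URL
    lambda x: 1 if x[2] > 3 else 0,    # many dots in the host name
]

# Toy held-out data, invented for illustration only.
X_val = [(60, 1, 5), (20, 0, 1), (70, 0, 2), (30, 1, 4), (25, 0, 5)]
y_val = [1, 0, 0, 1, 0]

weights = tree_weights(trees, X_val, y_val)
prediction = weighted_vote(trees, weights, (80, 1, 1))
```

With these toy inputs the second stump receives the largest weight, so its vote counts most in the final decision. A real implementation would train the trees on the cluster-selected features and use the paper's own weight formula in place of `tree_weights`.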