The Performance Comparison of the Classifiers According to Binary Bow, Count Bow and Tf-Idf Feature Vectors for Malware Detection

Bibliographic Details
Published in International journal of engineering & technology (Dubai) Vol. 7; no. 3.33; p. 15
Main Authors Man Kwon, Young; Hee Jun, So; Mo Gal, Won; Jae Lim, Myung
Format Journal Article
Language English
Published 2018
Summary: In this paper, we compare the performance of classifiers trained on Binary BOW, Count BOW and TF-IDF feature vectors for malware detection. The features are opcodes extracted from PE files. For the comparison, we measured the AUC score of five classifiers: DT, KNN, MLP, MNB and SVM. Based on the results, we recommend the neural network (MLP) and the instance-based model (KNN), because they show high AUC scores and accuracy regardless of dataset imbalance and the choice of feature vector. Among classical classifiers, we recommend DT, because it also gives high AUC scores and accuracy under the same conditions. With SVM, robust scaling is needed to handle outliers and the unbalanced dataset. With MNB, the N-gram technique is needed to improve the AUC score.
ISSN:2227-524X
DOI:10.14419/ijet.v7i3.33.18515
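
The three feature vectors named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy opcode sequences, the function names, and the TF-IDF weighting (term frequency times log(N/df)) are all assumptions chosen for illustration, since the record does not state the exact formulas used. The ngrams helper shows the kind of N-gram tokenization the summary mentions for MNB.

```python
import math
from collections import Counter


def ngrams(opcodes, n=1):
    """Join each run of n consecutive opcodes into a single token."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def build_vocabulary(docs):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(tokens)}


def count_bow(doc, vocab):
    """Count BOW: how many times each vocabulary token occurs."""
    vec = [0] * len(vocab)
    for tok, c in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = c
    return vec


def binary_bow(doc, vocab):
    """Binary BOW: 1 if the token occurs at all, else 0."""
    return [1 if c > 0 else 0 for c in count_bow(doc, vocab)]


def tfidf(docs, vocab):
    """TF-IDF with tf = count / len(doc) and idf = log(N / df).

    This is one common variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    rows = []
    for doc in docs:
        row = [0.0] * len(vocab)
        for tok, c in Counter(doc).items():
            row[vocab[tok]] = (c / len(doc)) * math.log(n_docs / df[tok])
        rows.append(row)
    return rows


# Hypothetical opcode sequences standing in for disassembled PE files.
samples = [
    ["push", "mov", "call", "mov"],
    ["mov", "xor", "jmp"],
    ["push", "call", "ret"],
]
vocab = build_vocabulary(samples)

if __name__ == "__main__":
    print("vocabulary:", list(vocab))
    print("count :", count_bow(samples[0], vocab))
    print("binary:", binary_bow(samples[0], vocab))
    print("tf-idf:", [round(x, 4) for x in tfidf(samples, vocab)[0]])
```

Passing ngrams(sample, 2) instead of the raw sequence to the same functions yields bigram features without any other change.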