Detection of Exceptional Malware Variants Using Deep Boosted Feature Spaces and Machine Learning

Bibliographic Details
Published in Applied Sciences Vol. 11; no. 21; p. 10464
Main Authors Asam, Muhammad, Hussain, Shaik Javeed, Mohatram, Mohammed, Khan, Saddam Hussain, Jamal, Tauseef, Zafar, Amad, Khan, Asifullah, Ali, Muhammad Umair, Zahoora, Umme
Format Journal Article
Language English
Published Basel MDPI AG 01.11.2021
Summary: Malware is a key component of cyber-crime, and its analysis is the first line of defence against cyber-attack. This study proposes two new malware classification frameworks: Deep Feature Space-based Malware Classification (DFS-MC) and Deep Boosted Feature Space-based Malware Classification (DBFS-MC). In the DFS-MC framework, deep features are generated by customized CNN architectures and fed to a support vector machine (SVM) for malware classification. In the DBFS-MC framework, discrimination power is enhanced by first combining the deep feature spaces of two customized CNN architectures into a boosted feature space; exceptional malware is then detected by providing this deep boosted feature space to an SVM. The performance of the proposed frameworks is evaluated on the MalImg malware dataset using the hold-out cross-validation technique. Malware variants such as Autorun.K, Swizzor.gen!I, Wintrim.BX and Yuner.A are hard to classify correctly because of the minor inter-class differences in their features. The proposed DBFS-MC improved performance on these difficult-to-discriminate malware classes by using feature boosting generated through customized CNNs. DBFS-MC achieved an accuracy of 98.61%, an F-score of 0.96, a precision of 0.96, and a recall of 0.96 on a stringent test set comprising 40% unseen data.
ISSN: 2076-3417
DOI: 10.3390/app112110464
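
The summary describes DBFS-MC as concatenating the deep feature spaces of two CNNs, passing the combined space to an SVM, and evaluating on a 40% hold-out set. The sketch below illustrates only that data flow and is not the authors' implementation. Everything in it is an assumption made for illustration: the two "extractors" are replaced by synthetic toy features, the function names (`boost_features`, `holdout_split`, `train_linear_svm`, `make_toy_features`) are hypothetical, the task is reduced to two classes rather than the many families in MalImg, and the SVM is a small linear one trained by subgradient descent on the hinge loss rather than whatever solver the paper used.

```python
import random

def boost_features(feats_a, feats_b):
    """Concatenate the per-sample feature vectors of two extractors."""
    return [a + b for a, b in zip(feats_a, feats_b)]

def holdout_split(X, y, test_fraction=0.4, seed=0):
    """Shuffle once and hold out a fraction of the samples for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=20, seed=0):
    """Subgradient descent on the L2-regularised hinge loss.

    Labels must be -1 or +1. There is no bias term, which is enough
    for the toy data below because it is centred on the origin.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Regularisation step: shrink the weights slightly.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin push the weights.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    """Return +1 or -1 according to the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def make_toy_features(labels, dim, seed):
    """Stand-in for a CNN feature extractor: well-separated synthetic vectors."""
    rng = random.Random(seed)
    return [[lab * rng.uniform(1.0, 2.0) for _ in range(dim)] for lab in labels]

# Two hypothetical "deep feature spaces" for the same 100 samples.
labels = [1, -1] * 50
feats_a = make_toy_features(labels, dim=3, seed=1)
feats_b = make_toy_features(labels, dim=2, seed=2)

# Feature boosting: one 5-dimensional vector per sample.
boosted = boost_features(feats_a, feats_b)

# Hold out 40% of the samples as unseen test data, then train and score.
X_tr, y_tr, X_te, y_te = holdout_split(boosted, labels, test_fraction=0.4)
w = train_linear_svm(X_tr, y_tr)
accuracy = sum(predict(w, x) == t for x, t in zip(X_te, y_te)) / len(y_te)
print(f"boosted dim = {len(boosted[0])}, test accuracy = {accuracy:.2f}")
```

Because the toy classes are trivially separable, this sketch says nothing about how much concatenation helps on real malware images; it only shows where the concatenation and the hold-out split sit relative to the classifier.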