A Hybrid Deep Network Framework for Android Malware Detection

Bibliographic Details
Published in	IEEE transactions on knowledge and data engineering Vol. 34; no. 12; pp. 5558 - 5570
Main Authors	Zhu, Hui-Juan, Wang, Liang-Min, Zhong, Sheng, Li, Yang, Sheng, Victor S.
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2022
Subjects	Algorithms; Coders; Deep learning; Feature extraction; Feature extraction or construction; Learning systems; Machine learning; Malware; modeling and prediction; neural nets; Performance enhancement; Representations; Smart phones; Static analysis; Support vector machines; Teaching methods
Summary:	Android is a growing target for malicious software (malware) because of its popularity and functionality. Malware poses a serious threat to users' privacy, money, equipment, and file integrity. A series of data-driven malware detection methods have been proposed. However, these methods face two key challenges: (1) how to learn effective feature representations from raw data; and (2) how to reduce the dependence on prior knowledge or human labor in feature learning. Inspired by the success of deep learning in feature representation learning, we propose a malware detection framework that starts by learning rich features with a novel unsupervised feature learning algorithm, Merged Sparse Auto-Encoder (MSAE). To extract more compact and discriminative features from the rich features and further boost malware detection capability, a hybrid deep network learning algorithm, Stacked Hybrid Learning MSAE and SDAE (SHLMD), is established by further incorporating a classical deep learning method, Stacked Denoising Auto-Encoders (SDAE). We then feed the features learned by MSAE and SHLMD, respectively, to classification algorithms, e.g., Support Vector Machine (SVM) or K-Nearest Neighbor (KNN), to train a malware detection model. Evaluation results on two real-world datasets show that SHLMD achieves 94.46 and 90.57 percent accuracy, respectively, outperforming the classical unsupervised feature representation learning method Sparse Auto-Encoder (SAE). MSAE performs similarly to SAE. SHLMD further improves on the performance of MSAE and of the supervised fine-tuned method SDAE. We also compare our methods with state-of-the-art detection approaches, including classical deep-learning-based methods. Extensive experiments show that the proposed methods are effective at detecting Android malware.
ISSN:	1041-4347 1558-2191
DOI:	10.1109/TKDE.2021.3067658
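
The summary describes a two-stage pipeline: an auto-encoder learns features without labels, and a conventional classifier such as KNN is then trained on those features. The sketch below illustrates only that general idea. It is not the authors' MSAE or SHLMD algorithm, whose details are not given in this record. It substitutes a generic single-layer denoising auto-encoder written with the Python standard library, and the class name, hyperparameters, and toy binary feature vectors are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class DenoisingAutoencoder:
    """Generic one-hidden-layer denoising auto-encoder (illustrative, not MSAE/SHLMD)."""

    def __init__(self, n_in, n_hidden, noise=0.2, lr=0.5, seed=0):
        self.rng = random.Random(seed)
        self.noise, self.lr = noise, lr
        self.W1 = [[self.rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[self.rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.W2, self.b2)]

    def train_step(self, x):
        # Masking noise: zero each input with probability `noise`,
        # then try to reconstruct the clean input x.
        x_noisy = [0.0 if self.rng.random() < self.noise else v for v in x]
        h = self.encode(x_noisy)
        y = self.decode(h)
        # Gradient of the squared error w.r.t. the output pre-activations.
        d_out = [(yi - xi) * yi * (1.0 - yi) for yi, xi in zip(y, x)]
        # Back-propagate to the hidden layer before W2 is modified.
        d_hid = [hj * (1.0 - hj) *
                 sum(self.W2[i][j] * d_out[i] for i in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for i, d in enumerate(d_out):
            for j, hj in enumerate(h):
                self.W2[i][j] -= self.lr * d * hj
            self.b2[i] -= self.lr * d
        for j, d in enumerate(d_hid):
            for k, xk in enumerate(x_noisy):
                self.W1[j][k] -= self.lr * d * xk
            self.b1[j] -= self.lr * d

    def fit(self, X, epochs=200):
        for _ in range(epochs):
            for x in X:
                self.train_step(x)
        return self


def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k nearest training features (Euclidean distance)."""
    nearest = sorted((math.dist(f, query), lbl)
                     for f, lbl in zip(train_feats, train_labels))
    votes = [lbl for _, lbl in nearest[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical toy data: each row is a binary static-feature vector
# (for example, permission flags); label 1 = malware, 0 = benign.
X = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
y = [1, 1, 1, 0, 0, 0]

# Stage 1: unsupervised feature learning (labels are not used here).
dae = DenoisingAutoencoder(n_in=6, n_hidden=3).fit(X)
features = [dae.encode(x) for x in X]

# Stage 2: train/apply a classifier on the learned features.
pred = knn_predict(features, y, dae.encode([1, 1, 1, 0, 0, 0]), k=3)
print("predicted label:", pred)
```

The auto-encoder never sees `y`, which mirrors the summary's point about reducing dependence on prior knowledge during feature learning. Only the final KNN vote uses labels. A real system would work with far larger feature vectors and, per the summary, could swap KNN for an SVM.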