SysDroid: a dynamic ML-based Android malware analyzer using system call traces

Bibliographic Details
Published in Cluster computing Vol. 23; no. 4; pp. 2789-2808
Main Authors Ananya, A., Aswathy, A., Amal, T. R., Swathy, P. G., Vinod, P., Mohammad, Shojafar
Format Journal Article
Language English
Published New York Springer US 01.12.2020
Springer Nature B.V
Summary: Android is a popular open-source operating system that is highly susceptible to malware attacks. Researchers have developed machine learning models, trained on attributes extracted with static or dynamic approaches, to identify malicious applications. However, such models suffer from low detection accuracy because conventional feature selection algorithms retain noisy attributes. Hence, this paper proposes a new feature selection mechanism known as selection of relevant attributes for improving locally extracted features using classical feature selectors (SAILS). SAILS targets the discovery of prominent system calls from applications and is built on top of conventional feature selection methods such as mutual information, distinguishing feature selector and Galavotti–Sebastiani–Simi. These classical attribute selection methods are used as local feature selectors. In addition, a novel global feature selection method, known as weighted feature selection, is proposed. The proposed feature selectors are compared comprehensively against the traditional methods. SAILS yields improved values for the evaluation metrics compared with the conventional feature selection algorithms across distinct machine learning models developed using Logistic Regression, CART, Random Forest, XGBoost and Deep Neural Networks. In our evaluations, accuracies range between 95 and 99% for dropout rates of 0.1–0.8 and learning rates of 0.001–0.2. Finally, the security of the malware classifiers against adversarial examples is thoroughly investigated. A decline in accuracy on adversarial examples is observed: the recall of the SAILS-based classifiers subjected to such examples lies in the range 24.79–92.2%, whereas prior to the attack the true positive rate obtained by the classifiers is between 95.2 and 99.79%. The results suggest that attackers can bypass detection by discovering the classifier's blind spots and adding a small number of legitimate attributes.
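
As an illustration of one of the classical local selectors named in the summary, the sketch below ranks system calls by mutual information with the class label. It is a minimal example under stated assumptions, not the SAILS procedure itself: the binary presence/absence encoding, the toy traces, and the function names `mutual_information` and `rank_system_calls` are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))  # counts of (feature value, label) pairs
    px = Counter(feature)                  # marginal counts of feature values
    py = Counter(labels)                   # marginal counts of labels
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def rank_system_calls(traces, labels, top_k):
    """Score each system call by MI with the label and return the top_k calls.

    Assumption: each trace is reduced to the set of calls it contains,
    so every feature is binary (call present or absent).
    """
    vocabulary = sorted({call for trace in traces for call in trace})
    scores = {}
    for call in vocabulary:
        column = [1 if call in trace else 0 for trace in traces]
        scores[call] = mutual_information(column, labels)
    ranked = sorted(vocabulary, key=lambda c: scores[c], reverse=True)
    return ranked[:top_k], scores


# Toy data (hypothetical): label 1 = malware, 0 = benign.
traces = [
    {"open", "read", "ptrace"},
    {"open", "ptrace", "sendto"},
    {"open", "read"},
    {"open", "write"},
]
labels = [1, 1, 0, 0]

top_calls, scores = rank_system_calls(traces, labels, top_k=1)
print(top_calls, round(scores[top_calls[0]], 3))
```

In this toy data, a call that appears in every trace (`open`) carries no information about the class, while a call that appears only in the malicious traces (`ptrace`) receives the highest score. A real pipeline would combine several such local scores, which is the role the summary assigns to the proposed weighted feature selection.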
ISSN:1386-7857
1573-7543
DOI:10.1007/s10586-019-03045-6