Detecting Internet of Things Malware on Evidence Generation
Published in | IEEE Internet of Things Journal, Vol. 11, no. 22, pp. 36950-36964 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 15.11.2024 |
Subjects | |
ISSN | 2327-4662 |
DOI | 10.1109/JIOT.2024.3439528 |
Summary: | Malware is a real threat to the Internet of Things (IoT). Although commercial antivirus solutions can detect malware files and provide labels indicating malware types or families, they provide no clear evidence explaining the detection. As a result, even security experts using these solutions do not know why some files are reported as malicious, and they hesitate to take immediate action. In this article, we study this problem from the viewpoint of antivirus solution users rather than product developers or sellers. We present a new data-driven scheme that automatically generates a set of readable common strings from IoT malware files as detection evidence. These generated string signatures not only provide clear detection evidence for suspicious files but can also be used as unique high-precision detection criteria. The scheme divides any long evasive string embedded in malware files into short n-grams to mitigate detection evasion, and it selects a limited number of representative n-grams on a bipartite graph, which improves the efficiency and accuracy of clustering. For each cluster, it generates a set of n-grams that serves as unique detection evidence. Through experiments with real malware data sets, including public data sets for experimental reproducibility, we confirm that the scheme not only detects malware files as accurately as the current state of the art (SOTA), in particular with no benign files mistakenly classified as malicious, but also provides readable strings as detection evidence, which previous work has not achieved. |
---|---|
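
The summary describes splitting embedded strings into short n-grams and relating them to files through a bipartite graph. The Python sketch below illustrates only that general idea and is not the authors' implementation: it omits the representative n-gram selection and the clustering step, and replaces them with a simple frequency filter. All function names, the n-gram length `n`, and the `min_files` threshold are assumptions made for illustration.

```python
import re
from collections import defaultdict


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Extract runs of printable ASCII, similar to the Unix `strings` tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def char_ngrams(text: str, n: int) -> set[str]:
    """Return every character n-gram of `text` (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_bipartite(files: dict[str, bytes], n: int) -> dict[str, set[str]]:
    """Map each n-gram to the set of file names containing it.

    This is the edge list of a file <-> n-gram bipartite graph.
    """
    edges: dict[str, set[str]] = defaultdict(set)
    for name, data in files.items():
        for s in printable_strings(data):
            for gram in char_ngrams(s, n):
                edges[gram].add(name)
    return edges


def common_ngrams(malware: dict[str, bytes], benign: dict[str, bytes],
                  n: int = 6, min_files: int = 2) -> list[str]:
    """Assumed stand-in for signature generation: keep n-grams shared by at
    least `min_files` malware files and absent from every benign file."""
    mal_edges = build_bipartite(malware, n)
    ben_edges = build_bipartite(benign, n)
    return sorted(
        gram for gram, owners in mal_edges.items()
        if len(owners) >= min_files and gram not in ben_edges
    )
```

For example, calling `common_ngrams` on two samples that both embed `/bin/busybox` followed by different commands would return n-grams such as `busybo`, which stay readable and survive even though the full strings differ. Excluding every n-gram seen in benign files reflects the summary's emphasis on avoiding false positives, at the cost of discarding some otherwise useful evidence.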