Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

Bibliographic Details
Published in: Security and Privacy in Communication Networks, Vol. 254, pp. 172-192
Main Authors: Dong, Shuaike; Li, Menghao; Diao, Wenrui; Liu, Xiangyu; Liu, Jian; Li, Zhou; Xu, Fenghao; Chen, Kai; Wang, XiaoFeng; Zhang, Kehuan
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2018
Springer International Publishing
Series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
ISBN: 9783030017002
3030017001
ISSN: 1867-8211
1867-822X
DOI: 10.1007/978-3-030-01701-9_10

Summary: Program code is a valuable asset to its owner. Because Java is easy to reverse-engineer, code protection is of particular importance for Android apps. To this end, code obfuscation, which complicates the representation of source code or machine code in order to hinder manual investigation and code analysis, is widely used by both legitimate app developers and malware authors. Although many previous studies have focused on obfuscation techniques, our knowledge of how obfuscation is applied by real-world developers is still limited. In this paper, we seek to better understand Android obfuscation and depict a holistic view of its usage through a large-scale investigation in the wild. In particular, we focus on three popular obfuscation approaches: identifier renaming, string encryption, and Java reflection. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the results. For example, more apps on third-party markets than malware use identifier renaming, and malware authors use string encryption more frequently. To explain each finding, we carry out in-depth code analysis on a sample of the Android apps. We believe our study will help developers select the most suitable obfuscation approach and, at the same time, help researchers improve code analysis systems in the right direction.