Structural analysis of packing schemes for extracting hidden codes in mobile malware
© The Author(s) 2016
Received: 2 April 2016
Accepted: 6 September 2016
Published: 15 September 2016
In the Internet of Things service environment, where all things are connected, mobile devices will become an extremely important medium linking together things with built-in heterogeneous communication functions. If a mobile device is compromised in this context, every thing linked to the device becomes a target of attack; therefore, greater emphasis will be placed on swift mobile malware detection and countermeasures. Mobile malware applies advanced code-hiding schemes so that the part of the code that executes malicious behavior is not detected by anti-virus software. In order to detect mobile malware, we must first conduct a structural analysis of its code-hiding schemes.
In this paper, we analyze the structure of two representative Android-based code-hiding tools, Bangcle and DexProtector, and then introduce a method and procedure for extracting the hidden original code. We also present experimental results on malware samples to which these tools have been applied.
Keywords: Repackaging attack; Android app security; Mobile code hiding
1 Introduction
With the advent of the Internet of Things (IoT) era, communication functions such as Wi-Fi and Bluetooth are being embedded in all things, making real-time connection between things possible. While connecting various IoT devices with a single communication standard is practically very difficult, individual users already own mobile devices that combine multiple communication modules, and companies provide a bring your own device (BYOD) work environment; hence, mobile devices will play a key role in the proliferation of IoT services.
However, when a mobile device, the coupling medium for all things, is exposed to hacking, all the things connected to it may also become targets. If a cyber attack is aimed at specific connected devices, the security vulnerability of mobile devices can bring about side effects that may even have a major adverse effect on a different industry. For example, in a smart car service, cars connected through a mobile device can become infected with a virus and, if an infected car has problems operating, studies show that, in the worst case, major car accidents could result. Consequently, in the IoT, it is a task of utmost importance and urgency to ensure that mobile devices cannot be infected easily.
Unfortunately, the amount of mobile malware is increasing each year, and it is evolving into various types. Recently, ransomware has even emerged, which secretly encrypts the documents on an infected device and then demands payment. The majority of mobile malware targets the Android platform. Android malware can generally be divided into two camps: malware designed to impersonate a normal app and malware designed to hide its malicious behavior. Malware of the first kind impersonates the apps that users use most frequently, such as apps related to smartphone themes, finance, and games; the look-alike app secretly executes malicious behavior, such as account charging and game information extraction, without the user realizing it. To prevent anti-virus software from detecting such malware, attackers apply methods such as encryption, packing, and obfuscation to the main code related to malicious behavior and then distribute it. For speedy malware detection and response, we must first understand the structure of such code-hiding methods so that the original code related to malicious behavior can be extracted.
Thus, in this paper, we analyze the fundamentals behind the main code-hiding schemes used by mobile malware and, through experiments, empirically present methods for extracting the original code hidden in the malware by reverse engineering. The experiments target malware to which the main code-hiding tools, Bangcle and DexProtector, have been applied, and we analyze in detail how to extract the actual code responsible for malicious behavior from the packed Android application package (APK) files generated by these two tools.
This paper is organized as follows. Section 2 presents background knowledge. Sections 3 and 4 describe the process of extracting the original code by conducting reverse engineering analysis of the structures of Bangcle’s packing scheme and DexProtector’s class encryption scheme. Section 5 presents results from experiments using sample apps, and Section 6 draws conclusions.
2 Related works
2.1 Repackaging attacks
The cause of various problems in the Android environment, including malware distribution, is as follows. The Android platform, also known as the Android Open Source Project (AOSP), promotes broad openness. While Android holds a high share of the smartphone market because of this, various security problems have arisen owing to the app structure and signing method devised to provide this openness.
Android apps are, by default, built using the Java language and are generated in the APK file format [9, 10]. Because the developer distributes an app by self-signing it with jarsigner, an attacker can re-sign the app with the attacker's own private key in place of the developer's signature and distribute it. An attacker can exploit this weakness to insert attack code into a normal app, repackage it, and then distribute it. This is called a repackaging attack [12, 13].
2.2 Mobile code packing
Packing is a method used to hide the code's structure before the code is run. Code packed by techniques such as encryption and compression is not revealed until execution; at each run time, the packed code is unpacked through dynamic loading and executed. Exposure of the code to static analysis is thereby minimized. Such packing methods are used mainly so that malware such as viruses or worms is not detected by anti-virus software; recently, however, normal programs have also used this method for copyright protection, to protect important code logic.
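The pack-then-unpack-at-run-time idea can be illustrated with a toy sketch. This is not how Bangcle or DexProtector work internally; the compression step (zlib), the single-byte XOR key, and the function names pack and unpack_and_run are all assumptions chosen only to show that the stored bytes no longer reveal the code, while the code is still recoverable and executable at run time.

```python
import zlib

def pack(source: str, key: int = 0x5A) -> bytes:
    """Compress the source and XOR every byte with a toy key (illustrative only)."""
    compressed = zlib.compress(source.encode("utf-8"))
    return bytes(b ^ key for b in compressed)

def unpack_and_run(blob: bytes, key: int = 0x5A) -> dict:
    """Undo the XOR and compression, then load the code dynamically."""
    source = zlib.decompress(bytes(b ^ key for b in blob)).decode("utf-8")
    namespace: dict = {}
    exec(source, namespace)  # dynamic loading happens only at run time
    return namespace

# Hypothetical "original code" that a packer would want to hide.
ORIGINAL = "def greet(name):\n    return 'hello ' + name\n"
packed = pack(ORIGINAL)
```

A scanner that only inspects the stored bytes sees `packed`, not the function definition; the definition exists in readable form only after `unpack_and_run(packed)` has been called, which is why the extraction methods in this paper target the run-time state.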
2.3 Mobile code reverse engineering
Reverse engineering methods for mobile code can be grouped roughly into static analysis methods and dynamic analysis methods. Static analysis uses a disassembler or decompiler. Disassemblers targeting the Android executable file format, called Dalvik Executable (DEX), include baksmali, dedexer, and apktoolkit; using these programs, a DEX file is converted into smali files so that the Dalvik virtual machine (DVM) bytecode can be analyzed instruction by instruction. As for decompilers, dex2jar, jad, JEB, etc. have been made public, and they provide the function of restoring a DEX file into Java source code. Static analysis of an app proceeds through APK extraction using ADB, DEX-to-JAR conversion using dex2jar, and JAR file analysis using a Java decompiler, while static analysis of a .so library is carried out using IDA with Hex-Rays. Through static analysis, we can investigate which functions and libraries are used.
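As a small illustration of the first static step, the sketch below lists the entries of an APK (a ZIP archive) that begin with the DEX magic bytes, which is one way to see which files a disassembler could process directly. The function names list_dex_entries and build_demo_apk and the demo archive contents are hypothetical; only the magic layout ("dex", newline, three version digits, NUL) comes from the DEX format.

```python
import io
import zipfile

DEX_MAGIC_PREFIX = b"dex\n"

def list_dex_entries(apk_bytes: bytes) -> dict:
    """Map each entry that starts with the DEX magic to its format version."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            with apk.open(name) as entry:
                header = entry.read(8)
            is_dex = (
                len(header) == 8
                and header.startswith(DEX_MAGIC_PREFIX)
                and header[7:8] == b"\x00"
            )
            if is_dex:
                found[name] = header[4:7].decode("ascii", errors="replace")
    return found

def build_demo_apk() -> bytes:
    """Build an in-memory archive with one real-looking and one opaque .dex entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as apk:
        apk.writestr("classes.dex", b"dex\n035\x00" + b"\x00" * 32)
        apk.writestr("assets/classes.dex", b"\x8f\x13\x77\x02" * 10)  # opaque bytes
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
    return buf.getvalue()
```

An entry that carries a .dex name but lacks the magic, like the opaque one in the demo archive, cannot be fed to a disassembler as is, which is the situation static analysis runs into with packed or encrypted code.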
Dynamic analysis enables accurate observation of the parameters and return values of variables and functions that are difficult to identify using static analysis, thus allowing effective analysis of the app. The main Android dynamic analysis tools are DroidScope, AppUse, and DroidBox, which run in sandbox-based emulator form. When an app is run in the emulator, the tool either analyzes the instructions executed in the DVM or analyzes the APIs being used, and provides the user with the activity information needed for app analysis. Furthermore, IDA and GDB are used not only for static analysis but also frequently for dynamic analysis.
3 Structural analysis on Bangcle
Tables and figures in this section: List of files added by Bangcle; Classes added and used by Bangcle; Interfacing with libraries; Entry point of the packed app; Loading the original app.
In this way, because Bangcle already applies an anti-debugging function, analysis methods that rely on existing debuggers do not work, and a different method must be used for analysis.
3.4 Code extraction
4 Structural analysis on DexProtector
First, the classes.dex of the original APK file is encrypted, and the encrypted file is saved under the same name, classes.dex, in the assets folder. At the same time, a new classes.dex file that decrypts and executes the encrypted classes.dex and an AndroidManifest.xml file updated with the modified file information are saved, and the encrypted APK file is generated.
4.3 Code extraction
During normal execution, the new.apk file is generated twice and then loaded. However, new.apk cannot be found afterwards because the file is deleted after loading. Therefore, to defeat DexProtector's class encryption scheme, the new.apk file, which is generated at run time, must be obtained before it is deleted.
Because every Android app process is forked from the Zygote process, we start debugging at the Zygote process. Using the debugger's follow-fork-mode child option, we debug the child process forked by Zygote.
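A minimal sketch of how such a debugging session could be launched is given below. It only builds the GDB command line; the helper name gdb_attach_command, the default binary name "gdb", and the example PID are assumptions (the experiments used a GDB build for ARM on a rooted device, whose exact invocation is not given here). The options -p and -ex and the setting set follow-fork-mode child are standard GDB features.

```python
import shlex

def gdb_attach_command(zygote_pid: int, gdb_binary: str = "gdb") -> list:
    """Build a GDB command line that attaches to Zygote and follows forked children."""
    if zygote_pid <= 0:
        raise ValueError("zygote_pid must be a positive process ID")
    return [
        gdb_binary,
        "-p", str(zygote_pid),                # attach to the running Zygote process
        "-ex", "set follow-fork-mode child",  # on fork, keep debugging the child
        "-ex", "continue",                    # resume until the app is forked
    ]

# Hypothetical PID, for illustration only.
command_line = shlex.join(gdb_attach_command(1234))
```

With this mode set, GDB switches to the newly forked app process when Zygote forks, so breakpoints can then be placed in the child before the transient file is removed.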
In this section, we describe experimental results for determining whether code extraction schemes can be used successfully for analyzing malware samples that apply Bangcle and DexProtector.
5.1 Target app selection
First, for Bangcle, we determined that it had been applied if decompiling the APK file revealed package names such as com.secapk.wrapper, com.secneo.guard, and com.bangcle.protect, or if libexec.so, libmain.so, bangcle_classes.jar, etc. were found in the assets folder. For DexProtector, it is difficult to determine its use from the decompilation results alone, beyond the fact that an Application class is included. We therefore determined that DexProtector had been applied from the log records, checking whether the DexOpt process optimizes an identically named apk file twice when the app is executed.
Based on the above criteria, we identified seven malware apps that applied Bangcle and one malware app that applied DexProtector, thus selecting a total of eight apps as the subjects of our experiment.
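The Bangcle criteria above can be sketched as a simple indicator check. The function name bangcle_indicators and the way package names are passed in (as a list already obtained from decompilation) are assumptions; the third package name is written as com.bangcle.protect on the assumption that the hyphen in the source is a line-break artifact.

```python
import io
import posixpath
import zipfile

BANGCLE_PACKAGES = ("com.secapk.wrapper", "com.secneo.guard", "com.bangcle.protect")
BANGCLE_ASSETS = {"libexec.so", "libmain.so", "bangcle_classes.jar"}

def bangcle_indicators(apk_bytes: bytes, decompiled_packages=()) -> list:
    """Return the Bangcle markers found in the package names and the assets folder."""
    hits = []
    for package in decompiled_packages:
        if package.startswith(BANGCLE_PACKAGES):
            hits.append("package:" + package)
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            in_assets = name.startswith("assets/")
            if in_assets and posixpath.basename(name) in BANGCLE_ASSETS:
                hits.append("asset:" + name)
    return hits

def build_demo_apk() -> bytes:
    """Build an in-memory archive that carries one of the asset markers."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as apk:
        apk.writestr("classes.dex", b"dex\n035\x00")
        apk.writestr("assets/libexec.so", b"\x7fELF")
    return buf.getvalue()
```

An empty result does not prove that an app is unpacked; the check only covers the markers named above, and other packer versions may use different names.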
5.2 Experimental setup
To extract code from malware that applied Bangcle and DexProtector, we experimented in the following environment. Because the extraction method for Bangcle only requires the app to be repackaged and executed, we ran the app on a non-rooted device running Android 4.4.4. We took the Bangcle version from the value assigned to the VERSION_NAME variable in the Util class. Because a debugger is needed for apps that have applied DexProtector, we used a rooted device to obtain higher privileges than the app process. For the debugger, we used a GDB build for ARM. Finally, for decompiling and repackaging, we used apktool version 2.0.3.
5.3 Experimental results
The analyzed apps had been packed with various versions of Bangcle, from 1.0 to 8.5.12. Nevertheless, we were able to extract the original code from all of the versions. Likewise, it was possible to extract the original code from the app protected with DexProtector.
Table: The analysis results on packed and unpacked applications with various analysis engines. Detection labels include: Secapk.E potentially unsafe; Android Dowgin (PUA).
In this paper, we present the experimental results of a reverse engineering analysis of code-hiding methods applied in Android-based malware. As subjects, we used two code-hiding tools: Bangcle, the most commonly used packer in the Android market, and DexProtector, the highest-performing binary code protector. Through structural analysis, we were able to identify the characteristics and fundamentals of the shielding method of each tool and were successful in extracting the original code that causes malicious behavior from all of the tested malicious apps. We predict that such an analysis method will be used as a foundational technique to quickly detect and counteract mobile malware, which is the core security risk factor in the proliferation of IoT services.
It should be noted that, if the shielding methods analyzed in this paper are used to protect code in legitimate apps, the extraction techniques presented here could be used against those apps. Therefore, further investigation is required on secure code-hiding methods that prevent such reverse engineering.
1 Note that we speculate that the reason for recompression after loading the original classes.dex is to prevent reverse engineering analysis of the folder saved in the temporary folder.
This research was supported by the Global Research Laboratory (GRL) program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT, and Future Planning (NRF-2014K1A1A2043029).
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- J Gubbi, R Buyya, S Marusic, M Palaniswami, Internet of things (IoT): A vision, architectural elements, and future directions. Future Generation Comput. Syst. 29(7), 1645–1660 (2013).
- A Kitana, I Traore, I Woungang, Impact study of a mobile botnet over LTE networks. J. Internet Serv. Inform. Secur. 6(2), 1–22 (2016).
- Hacking accident. https://blog.kaspersky.com/tesla-s-hacked-and-patched/9516/. Accessed 21 Jan 2016.
- J Park, H Kim, Y Jeong, S-J Cho, S Han, M Park, Effects of code obfuscation on android app similarity analysis. J. Wireless Mobile Netw. Ubiquitous Comput. Dependable Appl. 6(4), 86–98 (2015).
- Bangcle. http://www.bangcle.com.
- DexProtector. https://dexprotector.com/.
- A Design, Android open source project (2012). https://developer.android.com/design/index.html. Accessed 28 Jan 2016.
- B Rashidi, C Fung, A survey of android security threats and defenses. J. Wireless Mobile Netw. Ubiquitous Comput. Dependable Appl. 6:, 4–10 (2015).
- APK file. https://developer.android.com/tools/building/index.html.
- D Barrera, J Clark, D McCarney, PC van Oorschot, in Proceedings of the Second ACM Workshop on Security and Privacy in Smartphones and Mobile Devices. Understanding and improving app installation security mechanisms through empirical analysis of android (ACM, New York, 2012), pp. 81–92.
- W Enck, D Octeau, P McDaniel, S Chaudhuri, in Proceedings of the 20th USENIX Security Symposium (USENIX Security 11). A study of android application security (USENIX, Berkeley, 2011), pp. 21–21.
- J-H Jung, JY Kim, H-C Lee, JH Yi, Repackaging attack on android banking applications and its countermeasures. Wireless Pers. Commun. 73(4), 1421–1437 (2013).
- SW Park, JH Yi, Multiple device login attacks and countermeasures of mobile VoIP apps on android. J. Internet Serv. Inform. Secur. 4(4), 115–126 (2014).
- C Linn, S Debray, in Proceedings of the 10th ACM Conference on Computer and Communications Security. Obfuscation of executable code to improve resistance to static disassembly (ACM, New York, 2003), pp. 290–299.
- baksmali. http://code.google.com/p/smali/. Accessed 13 Feb 2016.
- dedexer. http://sourceforge.net/projects/dedexer/. Accessed 3 Mar 2016.
- apktoolkit. http://ibotpeaches.github.io/Apktool/. Accessed 8 Mar 2016.
- smali. https://code.google.com/p/smali/. Accessed 15 Feb 2016.
- Bytecode. https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html.
- dex2jar. http://code.google.com/p/dex2jar/. Accessed 28 Feb 2016.
- JEB. http://www.android-decompiler.com/. Accessed 24 Mar 2016.
- IDA. http://www.hex-rays.com. Accessed 27 Mar 2016.
- LK Yan, H Yin, in Presented as Part of the 21st USENIX Security Symposium (USENIX Security 12). Droidscope: seamlessly reconstructing the OS and Dalvik semantic views for dynamic android malware analysis (Berkeley, 2012), pp. 569–584.
- AppUse. https://appsec-labs.com/AppUse. Accessed 18 Feb 2016.
- P Lantz, A Desnos, K Yang, DroidBox: Android application sandbox (2012). https://code.google.com/archive/p/droidbox/. Accessed 2 Mar 2016.
- R Stallman, R Pesch, S Shebs, et al, Debugging with GDB. https://sourceware.org/gdb/onlinedocs/gdb. Accessed 24 Feb 2016.
- H Cho, J Lim, H Kim, JH Yi, Anti-debugging scheme for protecting mobile apps on android platform. J. Supercomputing. 72(1), 232–246 (2016).
- R Yu, in Proceedings of the Virus Bulletin Conference (VB’14). Android packer facing the challenges, building solutions (Abingdon, 2014), pp. 266–275.
- D Kim, J Kwak, J Ryou, Dwroiddump: Executable code extraction from android applications for malware analysis. Int J Distrib Sensor Netw. 11(9) (2015). Article ID: 379682.
- Contagio. http://contagiodump.blogspot.kr/. Accessed 31 Mar 2016.
- VirusShare. https://virusshare.com/. Accessed 31 Mar 2016.