CTF-All-In-One

8.19 DroidNative: Semantic-Based Detection of Android Native Code Malware


What is your take-away message from this paper

The paper proposes DroidNative, a system that detects Android malware variants in both bytecode and native code.

What are motivations for this work

native code

A recent study shows that 86% of the most popular Android applications contain native code.

current methods

The plethora of more sophisticated detectors that use static analysis to detect such variants operate only at the bytecode level, so malware embedded in native code goes undetected.

What is the proposed solution

This paper introduces DroidNative, a malware detection system for Android that operates at the native code level and can detect malware in either bytecode or native code. DroidNative performs static analysis of the native code and focuses on control flow patterns that are not significantly affected by obfuscation. It is not limited to native code: it also analyzes bytecode by using the Android runtime (ART) to compile bytecode into native code suitable for analysis. The use of control flow with patterns enables DroidNative to detect smaller malware, which in turn lets it reduce signature size to shorten detection time without lowering the detection rate (DR).

MAIL

DroidNative uses MAIL (Malware Analysis Intermediate Language) to provide an abstract representation of an assembly program, and that representation is used for malware analysis and detection.


Disassembler

Optimizer

The optimizer removes instructions that are not required for malware analysis. DroidNative builds multiple, smaller, interwoven control flow graphs (CFGs) for a program instead of a single, large CFG.

MAIL Generation

The MAIL Generator translates an assembly program to a MAIL program.
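As a rough illustration of what such a translation involves, the sketch below maps a few ARM-style mnemonics to coarse statement patterns. The mnemonic table, the label names, and the `annotate` function are assumptions made for illustration only; they are not MAIL's actual grammar or pattern set.

```python
# Hypothetical, heavily simplified mapping from assembly mnemonics to
# coarse statement patterns. The real MAIL language is richer than this.
PATTERNS = {
    "mov": "ASSIGN",
    "add": "ASSIGN",
    "ldr": "ASSIGN",
    "str": "ASSIGN",
    "cmp": "TEST",
    "b": "JUMP",
    "beq": "JUMP_C",  # conditional jump
    "bne": "JUMP_C",
    "bl": "CALL",
    "push": "STACK",
    "pop": "STACK",
}


def annotate(assembly_lines):
    """Return (instruction, pattern) pairs; unknown mnemonics map to UNKNOWN."""
    annotated = []
    for line in assembly_lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        mnemonic = text.split()[0].lower()
        annotated.append((text, PATTERNS.get(mnemonic, "UNKNOWN")))
    return annotated


program = [
    "push {r4, lr}",
    "mov r0, #1",
    "cmp r0, #0",
    "beq done",
    "bl helper",
    "pop {r4, pc}",
]
for instruction, pattern in annotate(program):
    print(f"{pattern:8} {instruction}")
```

Working on pattern labels rather than raw opcodes is what lets later stages compare programs structurally, since two instruction sequences that differ only in registers or constants receive the same labels.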

Malware Detection

ACFG

A CFG is built for each function in the annotated MAIL program, yielding the annotated control flow graphs (ACFGs).
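The per-function construction can be sketched with the usual basic-block approach. This is a minimal sketch, not DroidNative's implementation: the `(pattern, target)` statement encoding, the `JUMP`/`JUMP_C` labels, and the `build_cfg` function are hypothetical, and jump targets are assumed to be valid statement indices inside the same function.

```python
def build_cfg(statements):
    """Build a CFG for one function.

    `statements` is a list of (pattern, target) tuples, where `target` is the
    index of the jump destination or None. Returns (blocks, edges): blocks is
    a list of (start, end) index ranges and edges is a set of
    (from_block, to_block) pairs.
    """
    n = len(statements)
    if n == 0:
        return [], set()

    # A leader starts a basic block: the first statement, every jump target,
    # and every statement that follows a jump.
    leaders = {0}
    for i, (pattern, target) in enumerate(statements):
        if pattern in ("JUMP", "JUMP_C"):
            if target is not None:
                leaders.add(target)
            if i + 1 < n:
                leaders.add(i + 1)

    starts = sorted(leaders)
    blocks = [
        (s, (starts[k + 1] if k + 1 < len(starts) else n) - 1)
        for k, s in enumerate(starts)
    ]
    block_of = {s: k for k, s in enumerate(starts)}

    edges = set()
    for k, (start, end) in enumerate(blocks):
        pattern, target = statements[end]
        if pattern in ("JUMP", "JUMP_C") and target is not None:
            edges.add((k, block_of[target]))
        # Fall through unless the block ends in an unconditional jump.
        if pattern != "JUMP" and end + 1 < n:
            edges.add((k, block_of[end + 1]))
    return blocks, edges


# One small graph per function, rather than one large graph per program.
functions = {
    "check": [
        ("ASSIGN", None),
        ("TEST", None),
        ("JUMP_C", 4),
        ("ASSIGN", None),
        ("ASSIGN", None),
    ],
}
acfgs = {name: build_cfg(stmts) for name, stmts in functions.items()}
```

Keeping one graph per function keeps each graph small, which is consistent with the notes' point about building multiple smaller CFGs instead of a single large one.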


SWOD

Each MAIL pattern is assigned a weight based on the SWOD (sliding window of difference), which captures how the distribution of MAIL patterns differs between malware and benign samples.
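The idea of weighting patterns by distribution difference can be illustrated as follows. This is a simplified stand-in, not the paper's SWOD computation: it scores each pattern by the absolute difference between its relative frequencies in the two sample sets. The function names and the sample encoding (a list of pattern labels per sample) are assumptions.

```python
from collections import Counter


def pattern_distribution(samples):
    """Relative frequency of each pattern across a list of samples.

    Each sample is a list of pattern labels.
    """
    counts = Counter(p for sample in samples for p in sample)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def difference_weights(malware_samples, benign_samples):
    """Weight each pattern by how far apart its two frequencies are."""
    mal = pattern_distribution(malware_samples)
    ben = pattern_distribution(benign_samples)
    return {
        p: abs(mal.get(p, 0.0) - ben.get(p, 0.0))
        for p in set(mal) | set(ben)
    }


malware = [["CALL", "CALL", "ASSIGN", "JUMP"]]
benign = [["ASSIGN", "ASSIGN", "ASSIGN", "JUMP"]]
weights = difference_weights(malware, benign)
```

Patterns that occur equally often in both sets get a weight near zero, so they contribute little to distinguishing malware from benign programs.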


What is the work’s evaluation of the proposed solution

Dataset

The experimental dataset consists of 2240 Android applications in total. Of these, 1240 are Android malware programs collected from two different sources, and the other 1000 are benign programs comprising Android 5.0 system programs, libraries, and standard applications.

N-Fold Cross Validation

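The authors use n-fold cross validation to estimate performance and define the following evaluation metrics: detection rate (DR), false positive rate (FPR), receiver operating characteristic (ROC) curve, and area under the ROC curve (AUC).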
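The evaluation procedure can be sketched as below, assuming the standard definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN). The helper names `n_fold_splits` and `detection_metrics` are hypothetical, and the fold assignment shown here is just one reasonable choice.

```python
import random


def n_fold_splits(samples, n, seed=0):
    """Yield (train, test) pairs; each sample lands in exactly one test fold."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n] for i in range(n)]
    for i in range(n):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]


def detection_metrics(labels, predictions):
    """Return (DR, FPR) for boolean labels/predictions, True meaning malware."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    dr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return dr, fpr


# Example: two malware and two benign samples, one error of each kind.
dr, fpr = detection_metrics(
    [True, True, False, False],
    [True, False, True, False],
)
```

In a full run, a detector would be trained on each `train` split and scored on the matching `test` split, with DR and FPR averaged over the n folds.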


What is your analysis of the identified problem, idea and evaluation

This is the first research effort to detect Android malware in native code. It shows superior results for the detection of Android native code and malware variants compared with other research efforts and commercial tools.

But there are some limitations:

What are the contributions

What are future directions for this research

To improve DroidNative's resilience to such obfuscations, the authors plan to use a threshold for pattern matching. They will also investigate other pattern matching techniques, such as a statement dependency graph or assigning one pattern to multiple statements of different types, to improve this resilience.

What questions are you left with

Many other programming languages (JavaScript, Python, …) can be used for Android app development. How can malware written in those languages be detected?