SentinelDroid

An AI-based malware detection app for Android. The detection engine is implemented in TensorFlow and TFLite, and the on-device malware detection is implemented in Java.

Introduction

This project was undertaken during my work at the Devices & Network Security Lab under the National Center for Cyber Security (NCCS) at Air University, Islamabad. It has two major contributions:

Design and development of a CNN trained on features extracted through static analysis of Android apps
Deployment of the trained model to a mobile device for on-device detection of malicious Android apps

Detection Engine

In this phase, we used only static features extracted from Android apps. Static feature-based malware detection means that the application is never run on a device; instead, its source code is extracted and inspected for malicious patterns. Many static features can be monitored, but Wei Wang [1] summarized them and highlighted permissions and intent filters (both available in the manifest file of the APK package) as the most promising. We therefore used these features to train a neural network model for malware detection.

Dataset

The next step was to identify a comprehensive dataset for training the AI model. We looked for a dataset with the following properties:

Publicly and easily available
Covers our target features
Large enough to provide meaningful results
Covers recent malware as well

This search led us to select the CIC dataset named InvesAndMal2019 (released in 2019) for training the AI model.

Link to dataset:

https://www.unb.ca/cic/datasets/invesandmal2019.html

AI Model Development

The authors of the dataset reported results in their paper using machine learning techniques such as Random Forest and Decision Tree. We initially re-trained the same models. This worked in Colab notebooks (Google’s cloud-hosted, Python-based notebooks for experimenting with AI models), but the trained models could not be converted into TensorFlow Lite files (the files that hold the AI model and are bundled into the app so that it can run predictions on a given input).

After many failed attempts, we moved on to neural networks built with the Keras library, which were easy to use once we understood how to create the input tensor. We first trained a simple three-layer multilayer perceptron (MLP) on approximately 300 features. This model was loaded into the mobile application successfully and tested with multiple inputs.

Later, we performed feature reduction with the chi-square technique, which reduced the dataset from the original 8000+ features to 1024 features. The reduced features were used to train a different AI model, a Convolutional Neural Network (CNN), which was also loaded into the mobile app successfully. In parallel, we completed feature extraction at run time, which is discussed in the next section.
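
The chi-square feature reduction mentioned above can be illustrated with a minimal Java sketch. It assumes binary features (present or absent) and a binary label (malicious or benign), and scores each feature with the standard 2x2 contingency-table statistic. The class and method names are hypothetical, and the project's actual reduction code (which is not shown in this document) may compute the score differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: not part of the SentinelDroid sources.
final class ChiSquareSelector {

    /** Chi-square statistic of one binary feature against a binary label. */
    static double chiSquare(int[] feature, int[] label) {
        // a: feature present and malicious, b: present and benign,
        // c: absent and malicious,          d: absent and benign
        double a = 0, b = 0, c = 0, d = 0;
        for (int i = 0; i < feature.length; i++) {
            if (feature[i] == 1 && label[i] == 1) a++;
            else if (feature[i] == 1) b++;
            else if (label[i] == 1) c++;
            else d++;
        }
        double n = a + b + c + d;
        double denom = (a + b) * (c + d) * (a + c) * (b + d);
        if (denom == 0) {
            return 0.0; // constant feature or constant label carries no information
        }
        double diff = a * d - b * c;
        return n * diff * diff / denom;
    }

    /** Indices of the k highest-scoring features; columns[j] holds feature j for every sample. */
    static List<Integer> topK(int[][] columns, int[] label, int k) {
        double[] scores = new double[columns.length];
        List<Integer> order = new ArrayList<>();
        for (int j = 0; j < columns.length; j++) {
            scores[j] = chiSquare(columns[j], label);
            order.add(j);
        }
        // Sort by descending score; the sort is stable, so ties keep index order.
        order.sort(Comparator.comparingDouble((Integer j) -> -scores[j]));
        int keep = Math.max(0, Math.min(k, order.size()));
        return new ArrayList<>(order.subList(0, keep));
    }
}
```

Under these assumptions, calling `topK` with `k = 1024` on the full 8000+ feature columns would yield the indices of the features to keep.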

Feature Extraction on Device

Before the AI model can classify an installed app, that app’s features must be extracted on the device. These features are the following:

Permissions:

Permissions are extracted from the manifest file through the package manager and its built-in APIs. The permission extraction mechanism is explained in the Android developer guide. Link: https://developer.android.com/reference/android/content/pm/PackageManager#GET_PERMISSIONS

Intent-Filters:

Extracting this feature was harder to implement, because there is no built-in API for reading intent filters directly from the manifest. We therefore had to resort to extracting the manifest file and then using an XML parser to extract the intent filters. We identified an open-source app that already handled manifest extraction and XML parsing, added the intent filter extraction to it ourselves, and used it to generate the list of intent filters for a specific app. Link to open-source app: [Not Available]
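
As an illustration of the XML parsing step described above, the following is a minimal sketch using the standard Java DOM parser. It assumes the manifest has already been decoded into plain-text XML (manifests inside an APK are stored in a binary XML form), and it assumes that the action and category names of each intent filter are the values of interest. The class and method names are hypothetical and do not come from the open-source app mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: not part of the SentinelDroid sources.
final class IntentFilterExtractor {
    private static final String ANDROID_NS = "http://schemas.android.com/apk/res/android";

    /** Returns the android:name of every action and category inside each intent-filter. */
    static List<String> extractIntentFilterNames(String manifestXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // The manifest comes from an untrusted app, so reject DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(manifestXml.getBytes(StandardCharsets.UTF_8)));

            List<String> names = new ArrayList<>();
            NodeList filters = doc.getElementsByTagName("intent-filter");
            for (int i = 0; i < filters.getLength(); i++) {
                Element filter = (Element) filters.item(i);
                collectNames(filter, "action", names);
                collectNames(filter, "category", names);
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse manifest XML", e);
        }
    }

    private static void collectNames(Element filter, String tag, List<String> out) {
        NodeList children = filter.getElementsByTagName(tag);
        for (int i = 0; i < children.getLength(); i++) {
            String name = ((Element) children.item(i)).getAttributeNS(ANDROID_NS, "name");
            if (!name.isEmpty()) {
                out.add(name);
            }
        }
    }
}
```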

Front End / UI

The app’s front end is very simple. On start, the app displays all non-system apps as list items in a recycler view. The user can click any list item to compute that app’s Malicious/Benign status and the confidence level of the prediction. The functionality of the app is briefly discussed later in the document.

Structure of the app

The app consists of the following four Java files:

Main
List Apps
List Adapter
Malware Model

The function of each file is explained below:

Main Activity:

This activity is the entry point to the List Apps Activity. It contains an AI model and some user interface elements that were used in earlier experiments and are no longer used, except for the button in the lower-left corner. This button opens the List Apps Activity.

List Apps Activity:

This activity hosts the recycler view (Android’s list-rendering component) and retrieves the installed packages. All installed packages are retrieved first and then filtered down to non-system apps only. The remaining packages are handed to the List Adapter class, which generates the list.
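
The filtering to non-system apps can be sketched as a flag check. On Android, each installed app exposes an `ApplicationInfo.flags` bitmask in which `ApplicationInfo.FLAG_SYSTEM` marks system apps. So that the sketch compiles without the Android SDK, it mirrors that constant locally and takes the flags as a plain map from package name to bitmask. The class and method names are hypothetical, and whether the app uses exactly this check is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of the SentinelDroid sources.
final class NonSystemAppFilter {
    // Mirrors android.content.pm.ApplicationInfo.FLAG_SYSTEM (bit 0).
    static final int FLAG_SYSTEM = 1;

    static boolean isSystemApp(int applicationInfoFlags) {
        return (applicationInfoFlags & FLAG_SYSTEM) != 0;
    }

    /** Keeps only the package names whose flags do not mark them as system apps. */
    static List<String> keepNonSystem(Map<String, Integer> flagsByPackage) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : flagsByPackage.entrySet()) {
            if (!isSystemApp(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```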

List Adapter:

This class implements most of the app’s functionality, including:

Feature extraction of permissions
Feature extraction of intent filters
Creation and update of the view holders (list items)

Note: This class does not load the AI model or perform prediction. Placing the Load Model function in the List Adapter caused an issue, so it was placed separately in the Malware Model Activity. When a list item is clicked, this class extracts the features of that app and creates an input tensor, which is handed to the next activity to perform the prediction and obtain the result.
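
A minimal sketch of how the extracted features could be turned into an input tensor is shown below. It assumes a binary presence encoding: one slot per feature in a fixed vocabulary, in the same order as the columns used during training (for example, the 1024 selected features). The encoding, the class name, and the method name are assumptions for illustration; the document does not specify them. The null check is there because Android’s `PackageInfo.requestedPermissions` is null when an app requests no permissions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of the SentinelDroid sources.
final class FeatureVectorBuilder {

    /**
     * Builds a 0/1 vector with one slot per vocabulary entry.
     * vocabulary must list the training features in training order.
     */
    static float[] encode(List<String> vocabulary, String[] permissions, List<String> intentFilters) {
        Set<String> present = new HashSet<>();
        if (permissions != null) {
            for (String permission : permissions) {
                present.add(permission);
            }
        }
        if (intentFilters != null) {
            present.addAll(intentFilters);
        }
        float[] vector = new float[vocabulary.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = present.contains(vocabulary.get(i)) ? 1f : 0f;
        }
        return vector;
    }
}
```

Features that the app declares but that are not in the vocabulary are simply ignored, which matches the effect of dropping those columns during feature reduction.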

Malware Model Activity

This activity mainly performs the following tasks:

Loading the AI model
Feeding the input tensor to the loaded model
Performing the prediction
Reading the predicted value and displaying it on screen with a Benign/Malicious label
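
The last task in the list above, turning the predicted value into a label and a confidence level, could look like the sketch below. It assumes the model outputs a single probability in [0, 1] that the app is malicious and that 0.5 is the decision threshold. Neither detail is stated in this document, and the class name is hypothetical. With the TensorFlow Lite Java API, the value would typically be read from the output array passed to `Interpreter.run(input, output)`.

```java
// Hypothetical helper: not part of the SentinelDroid sources.
final class PredictionResult {
    final String label;
    final float confidence;

    private PredictionResult(String label, float confidence) {
        this.label = label;
        this.confidence = confidence;
    }

    /**
     * Maps a malicious-class probability to a label and its confidence.
     * Assumes a single sigmoid-style output and a 0.5 threshold.
     */
    static PredictionResult fromScore(float maliciousProbability) {
        if (Float.isNaN(maliciousProbability)
                || maliciousProbability < 0f || maliciousProbability > 1f) {
            throw new IllegalArgumentException("Expected a probability in [0, 1]");
        }
        boolean malicious = maliciousProbability >= 0.5f;
        // Confidence is the probability of whichever class was chosen.
        float confidence = malicious ? maliciousProbability : 1f - maliciousProbability;
        return new PredictionResult(malicious ? "Malicious" : "Benign", confidence);
    }
}
```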

This activity shows the user the result (Benign/Malicious status) for an app. Results: The result for an installed app is currently displayed when the user clicks its list item.

Appendix – A

[Other related information]

Model Viewer

Any TFLite model file can be viewed in Netron (not Neutron), an open-source model viewer. Link: https://github.com/lutzroeder/netron

