# Malware-classifier-PyTorch

**Repository Path**: dyjch/Malware-classifier-PyTorch

## Basic Information

- **Project Name**: Malware-classifier-PyTorch
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-02-20
- **Last Updated**: 2021-02-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Malware Classifier with PyTorch
Malware Classifier is a REST API built with Django that classifies .apk files as either legitimate or malware. 
<img align="right" width="100" height="120" src="https://zupimages.net/up/19/01/qzz3.png">
After finishing the Udacity course **Deep Learning with PyTorch**, we were told to build a side project to practice what we had learnt =) 


## Motivation and how I worked 
![Function](https://zupimages.net/up/19/01/vdf2.gif)

Legitimate           |  Malware
:-------------------------:|:-------------------------:
![](https://zupimages.net/up/19/01/p2m7.jpg)  |  ![](https://zupimages.net/up/19/01/f591.jpg)


At first, my motivation was to test the knowledge I had acquired in the course by applying it to real examples, and to find out whether I could improve on some ideas to produce a complete, well-developed and helpful application. 
One of the side projects I made was this malware classifier. I have always wondered how antivirus software is made, and I am always surprised by the number of characteristics that have to be checked to detect whether a program is malicious. I thought I could use deep learning to classify a program as legitimate or malicious.
After searching a bit, I found some articles about classifying .apk files as malware or legitimate. Without reading further, I began to think about several features I could use to detect malware based on the manifest file (I used to develop Android applications, so it was a fun experience to try to detect whether an application respects the permissions). 

I found a cool XML parser, **AXMLPrinter2** (https://github.com/flyfei/ApkDecompile/tree/master/Tools), that extracts the manifest from an apk and makes it readable. I extracted all the permissions that the app requires, and those are my features (in the future, the application may also extract the features the app requires from the manifest). 
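To make the feature-extraction step concrete, here is a minimal sketch of turning a decoded manifest into a permission vector. It assumes the manifest has already been converted to plain-text XML (for example by AXMLPrinter2). The names `KNOWN_PERMISSIONS`, `extract_permissions`, `to_feature_vector` and `SAMPLE_MANIFEST` are hypothetical and are not taken from this repository; the actual permission vocabulary used by the project is not listed in this README.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "android:" attribute prefix in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical, shortened vocabulary for illustration only;
# the real feature list used by the project is an unknown here.
KNOWN_PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
]


def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{" + ANDROID_NS + "}name"
    return {
        elem.attrib[name_attr]
        for elem in root.iter("uses-permission")
        if name_attr in elem.attrib
    }


def to_feature_vector(permissions, vocabulary=KNOWN_PERMISSIONS):
    """Encode permissions as a 0/1 vector, one slot per vocabulary entry."""
    return [1.0 if name in permissions else 0.0 for name in vocabulary]


# A tiny made-up manifest standing in for real AXMLPrinter2 output.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.SEND_SMS" />
    <application android:label="Example" />
</manifest>"""

permissions = extract_permissions(SAMPLE_MANIFEST)
features = to_feature_vector(permissions)
print(sorted(permissions))
print(features)  # [1.0, 0.0, 1.0, 0.0]
```

A fixed vocabulary keeps every .apk mapped to a vector of the same length, which is what a fully connected network expects as input.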

Once I knew how to extract the features from an apk, I searched for a dataset and found this one: https://www.unb.ca/cic/datasets/android-adware.html

With the data ready, I jumped to the fun part: training! I chose a fully connected neural network (with 1, 2, or 3 hidden layers) and played with the hyperparameters and the optimizer to obtain a training accuracy of 98.3% and a test accuracy of 95% on 1500 .apk files in total. 
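As an illustration of that kind of model, here is a minimal PyTorch sketch of a fully connected classifier with a configurable number of hidden layers, followed by a single training step. Everything specific in it is an assumption rather than the project's actual configuration: the helper name `build_classifier`, the layer sizes, the dropout rate, the Adam optimizer, the learning rate, the input size of 100 permissions, and the random stand-in data.

```python
import torch
from torch import nn


def build_classifier(num_features, hidden_sizes=(64, 32), dropout=0.2):
    """Build an MLP with one hidden layer per entry in hidden_sizes.

    The final layer outputs 2 logits (legitimate vs. malware).
    """
    layers = []
    in_size = num_features
    for hidden in hidden_sizes:
        layers += [nn.Linear(in_size, hidden), nn.ReLU(), nn.Dropout(dropout)]
        in_size = hidden
    layers.append(nn.Linear(in_size, 2))
    return nn.Sequential(*layers)


torch.manual_seed(0)

# Assumed sizes: 100 permission features, batch of 16 samples.
model = build_classifier(num_features=100, hidden_sizes=(64, 32))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random 0/1 vectors and labels standing in for real permission features.
inputs = torch.randint(0, 2, (16, 100)).float()
labels = torch.randint(0, 2, (16,))

# One training step: forward pass, loss, backward pass, weight update.
model.train()
optimizer.zero_grad()
logits = model(inputs)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()

# Inference: pick the class with the highest logit.
model.eval()
with torch.no_grad():
    predictions = model(inputs).argmax(dim=1)

print(logits.shape, predictions.shape)
```

Passing `hidden_sizes=(64,)` or `hidden_sizes=(128, 64, 32)` gives the 1-layer and 3-layer variants, which makes it easy to compare depths while keeping the rest of the training loop unchanged.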

![Training Accuracy and Loss ](https://zupimages.net/up/19/01/fza9.png)

I started by deploying it as a web app and got it working; the next step is to deploy it as an Android app. 

## Requirements 
All the packages needed to run the application are listed in requirements.txt. 

## Test the application 
Clone the repository and run the following commands: 

1. Install all the requirements of the app (listed in requirements.txt):  ``` pip install -r requirements.txt ```

2. Launch the Django web app:  ``` python manage.py runserver ```

3. Open http://127.0.0.1:8000 in your browser

4. The test folder contains two .apk files, one malware and one legitimate. Choose one of them and upload it to the app =) 

5. Test your real .apk files!