ELF Miner

1.0.0

This is an approximate implementation of the ELF Miner framework as described in this paper. The difference is that this model classifies a given ELF file as Malware or Benign, but does not classify the Malware into the five types described in the paper, because of the limitations of the dataset used.

Requirements

Python 2.7
Java 8.0+
WEKA-3.6 toolkit
KEEL toolkit

Usage

Put the ELF files to be analyzed in the elfs folder.
Install the Python dependencies: pip install -r requirements.txt
From the root of the project, run: python run_system.py

This prints the predicted class (Malware or Benign) for each ELF file, in the same order as the rows of the final.csv generated in the elfs folder.
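Because the predictions are printed in row order, they can be matched back to file names by position. The following is a minimal sketch of that pairing, not part of the project: it assumes final.csv has a header row with the file name in the first column, and that the script prints one label per line. Both are assumptions, and the sample contents and the helper name pair_predictions are hypothetical.

```python
import csv

# Hypothetical contents of final.csv: a header row, then one row per ELF file
# with the file name in the first column (an assumption about the layout).
FINAL_CSV = "filename,feature_a\nls,1\ndropper.bin,7\n"

# Hypothetical printed output of run_system.py, one label per line.
PRINTED = "Benign\nMalware\n"


def pair_predictions(csv_text, printed_text):
    """Match each printed label to the CSV row at the same position."""
    rows = list(csv.reader(csv_text.splitlines()))
    names = [row[0] for row in rows[1:]]  # skip the header row
    labels = [line.strip() for line in printed_text.splitlines() if line.strip()]
    if len(names) != len(labels):
        raise ValueError("number of predictions does not match number of rows")
    return list(zip(names, labels))


for name, label in pair_predictions(FINAL_CSV, PRINTED):
    print("%s: %s" % (name, label))
```

The length check is there because a positional match silently goes wrong if a file was skipped during feature extraction.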

Steps involved

Run the ELF Miner framework for feature extraction on the given ELF files. The dataset and the extracted features are explained in detail in the presentation linked below. Roughly 340 features are extracted initially.
Perform some postprocessing on the CSV to convert values for certain attributes to a form suitable for applying Machine Learning.
Perform feature selection on the CSV file. The features to remove were determined with WEKA using Information Gain: those with an Information Gain of 0 are dropped, and they are listed in feature_selection/weka_features_toremove.txt. This reduces the number of features to 147.
Use the saved models (obtained by training multiple classifiers with WEKA) to make predictions. The saved models and their details are in the models folder.
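The feature selection step can be illustrated with a small sketch. Information Gain is the entropy of the class labels minus the entropy that remains after splitting on a feature, so a feature with a gain of 0 says nothing about the class. The project computes this with WEKA; the code below is only a plain Python illustration that treats every feature value as discrete (WEKA handles numeric attributes differently), and the feature names and toy data are hypothetical.

```python
from __future__ import division

import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())


def information_gain(values, labels):
    """Entropy of the labels minus their entropy after splitting on a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def features_to_keep(rows, labels, names, tolerance=1e-12):
    """Return the names of the features whose information gain is non-zero."""
    keep = []
    for index, name in enumerate(names):
        column = [row[index] for row in rows]
        if information_gain(column, labels) > tolerance:
            keep.append(name)
    return keep


# Toy data with hypothetical feature names: the first column never varies,
# so it cannot help separate the two classes.
names = ["constant_field", "section_count"]
rows = [[1, 3], [1, 3], [1, 9], [1, 9]]
labels = ["Benign", "Benign", "Malware", "Malware"]
print(features_to_keep(rows, labels, names))
```

In this toy data the constant column has a gain of 0 and is dropped, while the second column separates the classes perfectly and is kept.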

Two classes of classifiers are used in the paper:

Non-Evolutionary Classifiers
- JRip
- J48
- PART
- Random Forest
Evolutionary Classifiers
- UCS
- XCS
- GAssist-Adi

For the Non-Evolutionary Classifiers we have used the WEKA toolkit, and for the Evolutionary Classifiers we have used the KEEL toolkit. The accuracy of each of these classifiers (on a 70-30 train-test split) is given in keel/results/results.txt.
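For readers unfamiliar with the evaluation setup, here is a minimal sketch of a 70-30 split and an accuracy computation in plain Python. It is an illustration only: the project performs the split and evaluation inside WEKA and KEEL, and whether the split was shuffled or stratified is not stated, so the seeded random shuffle here is an assumption.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it at train_fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / float(len(actual))


# Ten placeholder samples: 7 go to training, 3 to testing.
samples = list(range(10))
train, test = train_test_split(samples)
print("%d train, %d test" % (len(train), len(test)))

# Two of three hypothetical predictions are correct.
print(accuracy(["Malware", "Benign", "Benign"], ["Malware", "Benign", "Malware"]))
```

Fixing the seed keeps the split reproducible, which matters when accuracies of several classifiers are compared on the same test set.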

However, the end-to-end system incorporates a voting classifier based only on the Non-Evolutionary classifiers, due to the availability of WEKA's Java API.
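The idea behind the voting classifier can be sketched as a simple majority vote over the four Non-Evolutionary classifiers. The project builds its voting classifier through WEKA's Java API; the Python below is only an illustration of the concept. With four voters a 2-2 tie is possible, and resolving it as Malware is an assumption of this sketch, not something the project documents.

```python
from collections import Counter


def majority_vote(predictions, tie_label="Malware"):
    """Return the most common label, or tie_label when the top two are tied."""
    ranked = Counter(predictions).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label  # assumed tie-breaking rule
    return ranked[0][0]


# Hypothetical per-classifier outputs for a single ELF file.
votes = {
    "JRip": "Malware",
    "J48": "Benign",
    "PART": "Malware",
    "Random Forest": "Malware",
}
print(majority_vote(list(votes.values())))
```

Defaulting ties to Malware errs on the side of flagging a file; a deployment that prefers fewer false alarms could pass tie_label="Benign" instead.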