This is an approximate implementation of the ELF Miner framework as described in this paper. The difference is that this model classifies a given ELF as Malware or Benign but does not classify the malware into the five types described in the paper, because of the limitations of the dataset used.
pip install -r requirements.txt

python run_system.py

This prints the predicted class (Malware or Benign) for each ELF file, in the same order as in the generated final.csv in the elfs folder.
The features to remove are listed in feature_selection/weka_features_toremove.txt. These are the features with zero Information Gain. Removing them reduces the number of features to 147.

models folder.

Two classes of classifiers have been used in the paper: Non-Evolutionary and Evolutionary.
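The zero-Information-Gain filter mentioned above can be illustrated with a minimal sketch. This is not the WEKA code used in this repository. It assumes every feature takes discrete values, and the function names and the toy feature names (`constant_field`, `section_count`) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one discrete feature with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def zero_gain_features(columns, labels, eps=1e-12):
    """Names of features whose information gain is (numerically) zero."""
    return [name for name, values in columns.items()
            if information_gain(values, labels) <= eps]


# Toy data (hypothetical): one constant feature and one informative feature.
labels = ["Malware", "Malware", "Benign", "Benign"]
columns = {
    "constant_field": [0, 0, 0, 0],
    "section_count": [9, 9, 3, 3],
}
to_remove = zero_gain_features(columns, labels)
```

A feature that never varies, or whose values are distributed identically across both classes, tells the classifier nothing about the label, which is why such features can be dropped without losing information.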
For the Non-Evolutionary classifiers we used the WEKA toolkit, and for the Evolutionary classifiers we used the KEEL toolkit. The accuracy of each classifier (on a 70-30 train-test split) is given in keel/results/results.txt.
However, the end-to-end system incorporates a voting classifier based only on the Non-Evolutionary classifiers, due to the availability of WEKA's Java API.
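As a rough illustration of how a voting classifier combines several base classifiers, here is a minimal sketch. It assumes simple majority (hard) voting over per-file labels. The actual combination rule used by the system is not stated here and may differ, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per file.

    predictions: one list of labels per classifier, all in the same file order.
    Ties go to the label seen first for that file, so an odd number of
    classifiers is the safer choice for a two-class problem.
    """
    voted = []
    for labels_for_file in zip(*predictions):
        counts = Counter(labels_for_file)
        voted.append(counts.most_common(1)[0][0])
    return voted


# Three hypothetical classifiers, each predicting for the same three ELF files.
predictions = [
    ["Malware", "Benign", "Malware"],
    ["Malware", "Benign", "Benign"],
    ["Benign", "Benign", "Malware"],
]
final = majority_vote(predictions)
```

With these inputs the first and third files are labeled Malware by two of the three classifiers, and the second is labeled Benign by all three.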
If you find any issues or bugs, feel free to open an issue or open a pull request if you wish to make an improvement.