Robust Feature Extraction AI For Malware Detection and Threat Identification

Zuberi, Hafiz Talha Arif



URI: http://10.250.8.41:8080/xmlui/handle/123456789/46816

Date: 2024

Abstract:

Cybersecurity threats continue to grow in complexity and scale. This work proposes robust feature extraction and machine learning techniques for detecting and identifying malware, using a private, initially unlabelled dataset of MS-Office and Portable Executable (PE) files. Robust feature extraction via the SCORE framework was pivotal in ensuring the models' reliability and performance under adversarial conditions. To address data imbalance, SMOTE resampling was applied. Multiple machine learning models, including K-Nearest Neighbours (KNN), Random Forest (RF), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and a custom Convolutional Neural Network (CNN), were fine-tuned for both malware detection (binary classification) and threat identification (multi-class classification). The models were evaluated using several performance metrics. K-fold and leave-one-out cross-validation were employed to improve robustness, and resource usage and run time were tracked. The research achieved state-of-the-art results, with significant success in identifying obfuscated and adversarially modified malware. To further evaluate the robustness of the models, independent validation was used; it provided strong evidence of the models' generalization capabilities and resilience to unseen malware samples.
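
The abstract names SMOTE resampling as the remedy for class imbalance but does not describe how it was applied. As a rough illustration only, the sketch below shows the core idea of SMOTE in plain Python: each synthetic minority sample is an interpolation between a minority point and one of its k nearest minority neighbours. The function name `smote`, the toy feature vectors, and the parameter values are assumptions made for this example; they are not taken from the thesis, which may well have used a library implementation with different settings.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up feature vectors: 8 benign samples, 3 malicious ones.
benign = [(float(i), float(i % 3)) for i in range(8)]
malicious = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate enough synthetic malicious samples to match the benign count.
new_points = smote(malicious, n_new=len(benign) - len(malicious), k=2)
balanced_malicious = malicious + new_points
print(len(benign), len(balanced_malicious))
```

When resampling is combined with cross-validation, a common precaution is to oversample only the training portion of each fold, so that synthetic points do not leak into the data used for evaluation.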
