Abstract:
Many malware detection systems today rely on signature-based techniques
to detect malware and viruses. These systems need regular signature
updates, because a virus can be recompiled or packed so that its
signature changes and it no longer matches any entry in the existing
signature database. Moreover, zero-day or newly infected files are not
detected by signature-based antivirus (AV) products; their signatures are
generated only after they have done their damage and become widely known.
Hence it becomes very important for users to keep their antivirus updated. To
overcome these problems, we propose a solution based on Artificial
Intelligence techniques: a classifier is trained and tested on sample
files, and the resulting decision tree classifies a Portable Executable
(PE) file as malicious or benign. Clients will therefore not require
frequent updates, and the probability of detecting zero-day infections is
expected to increase significantly.
Our project implements data mining algorithms, mainly the C4.5 decision
tree learner, and applies them to a custom-generated dataset. We generated
the dataset from already known malicious executable files. A C4.5 decision
tree is built from this dataset, and unknown executables are passed
through the tree to classify each one as malicious or benign.
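
As a rough illustration of how a C4.5-style learner picks the attribute to
split on, the following minimal sketch computes the gain ratio for discrete
features. It is not the project's implementation: the feature names
(is_packed, has_debug_info) and the four sample rows are hypothetical, and
the sketch omits the rest of C4.5 (continuous attributes, missing values,
pruning).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of splitting on one discrete feature (C4.5 criterion)."""
    total = len(labels)
    # Group the class labels by the value each row has for this feature.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = -sum((len(p) / total) * math.log2(len(p) / total)
                      for p in partitions.values())
    return info_gain / split_info if split_info > 0 else 0.0


def best_feature(rows, labels, features):
    """Return the feature with the highest gain ratio."""
    return max(features, key=lambda f: gain_ratio(rows, labels, f))


# Hypothetical toy dataset: feature names and values are illustrative only.
rows = [
    {"is_packed": True, "has_debug_info": False},
    {"is_packed": True, "has_debug_info": True},
    {"is_packed": False, "has_debug_info": True},
    {"is_packed": False, "has_debug_info": False},
]
labels = ["malicious", "malicious", "benign", "benign"]

root = best_feature(rows, labels, ["is_packed", "has_debug_info"])
print(root)
```

In this toy data is_packed separates the two classes perfectly, so it is
chosen as the root test; a full learner would repeat the selection on each
resulting partition until the leaves are pure or a stopping rule applies.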
The purpose is to move away from manual, signature-based malware detection
systems that require constantly updated signatures, and to make systems
artificially immune to unknown and zero-day malicious executables.