Abstract:
Internet usage has grown intensively over the last few decades, driving the adoption of email, one of the fastest and cheapest modes of communication. A rapidly growing number of internet users rely on email to communicate with anyone across the globe within a few seconds. However, the rise in email and internet users has been accompanied by a striking increase in unsolicited bulk (spam) email.
Based on a study of this topic, several classifiers are examined, and an ensemble model is proposed that combines the classifiers' predictions by majority voting and applies the result to the task of filtering spam emails.
The first part of the thesis describes spam, its types, and how it has affected a growing number of people, creating a need for reliable spam filters. This part also reviews the spam filtering techniques that researchers have used so far, along with their limitations.
The methodology section discusses the ensemble model in detail. The section that follows presents results and comparisons that validate the model in practice using statistical data. The tool used for testing is RapidMiner version 5.3.015.
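As an illustration of the majority-voting idea behind the proposed ensemble, the following is a minimal Python sketch, not the RapidMiner process used in the thesis. The three rule-based classifiers (`has_spam_keyword`, `many_exclamations`, `has_link`) and the labels `"spam"` and `"ham"` are hypothetical stand-ins chosen only to keep the example self-contained; the thesis itself combines trained classifiers.

```python
from collections import Counter

# Hypothetical toy classifiers: each maps an email body to "spam" or "ham".
def has_spam_keyword(text):
    keywords = ("free", "winner", "prize")
    return "spam" if any(k in text.lower() for k in keywords) else "ham"

def many_exclamations(text):
    return "spam" if text.count("!") >= 3 else "ham"

def has_link(text):
    return "spam" if "http://" in text or "https://" in text else "ham"

CLASSIFIERS = (has_spam_keyword, many_exclamations, has_link)

def majority_vote(predictions):
    """Return the label chosen by the largest number of classifiers."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label

def ensemble_predict(classifiers, text):
    """Collect one prediction per classifier and combine them by majority vote."""
    return majority_vote([clf(text) for clf in classifiers])

if __name__ == "__main__":
    print(ensemble_predict(CLASSIFIERS, "WINNER! Claim your free prize now!!! http://example.com"))
    print(ensemble_predict(CLASSIFIERS, "Meeting moved to 3pm, see agenda at https://example.com/agenda"))
```

Using an odd number of classifiers with two labels avoids ties, so the vote always has a clear winner.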