NUST Institutional Repository

IMPROVING ANOMALY DETECTION PERFORMANCE USING INFORMATION THEORETIC AND MACHINE LEARNING TOOLS

dc.contributor.author Ashfaq, Ayesha Binte
dc.date.accessioned 2020-11-04T11:02:00Z
dc.date.available 2020-11-04T11:02:00Z
dc.date.issued 2014
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/9829
dc.description Supervisor: Dr. Syed Ali Khayam en_US
dc.description.abstract Anomaly detection systems (ADSs) were proposed more than two decades ago, and since then considerable research effort has been invested in designing and evaluating these systems. However, accuracy, in terms of detection and false alarm rates, has been a major limiting factor in their widespread deployment. Hence, in this thesis we (i) propose and evaluate information-theoretic techniques to improve the performance of existing general-purpose anomaly detection systems; (ii) design and evaluate a novel, specific-purpose, machine learning-based anomaly detection solution for bot detection; (iii) stochastically model general-purpose anomaly detection systems and show that these systems are inherently susceptible to parameter estimation attacks; and (iv) propose novel design philosophies to combat these attacks. To improve the performance of current general-purpose anomaly detection systems, we propose (i) a feature space slicing framework; and (ii) a multi-classifier ADS. The feature space slicing framework operates as a pre-processor that segregates the feature instances at the input of an ADS. We provide a statistical analysis of mixed traffic highlighting two factors that limit the performance of current ADSs: a high volume of benign features, and attack instances that exhibit strong similarity with benign feature instances. To mitigate these accuracy-limiting factors, we propose a statistical information-theoretic framework that segregates the ADS feature space into multiple subspaces before anomaly detection. Thorough evaluations on real-world traffic datasets show that considerable performance improvements can be achieved by judiciously segregating feature instances at the input of a general-purpose ADS.
The multi-classifier ADS, on the other hand, defines a standard-deviation-normalized, entropy-of-accuracy-based post-processor that judiciously combines the outputs of diverse general-purpose anomaly detection classifiers, thus building on their strengths and mitigating their weaknesses. Evaluations on diverse datasets show that the proposed technique provides significant improvements over existing techniques. During the course of this research, the threat landscape changed considerably, with botnets emerging as the most potent threat. However, existing general-purpose anomaly detection systems are largely ineffective in detecting this evolving threat because botnets are distinctively different from their predecessors. Since botnets follow a somewhat invariant lifecycle, current bot detection tools employ the bot lifecycle for detection instead of pure behavior-based solutions. However, these specific-purpose tools use rigid rule-based detection logic that falls short of providing acceptable accuracy as botnet behavior evolves [1]. Extending the design philosophy of this thesis, we propose a post-processing detection logic for specific-purpose bot detection. The proposed post-processor models the high-level bot lifecycle as a Bayesian network. Experimental evaluations on diverse real-world botnet traffic datasets show that a Bayesian inference-based post-processor provides considerable performance improvements over existing approaches. Lastly, we stochastically model a few existing general-purpose anomaly detection systems and demonstrate that these systems are highly susceptible to parameter estimation attacks. Since current-day malware is becoming increasingly stealthy and difficult to mine in overwhelming volumes of benign traffic, we argue that anomaly detection systems need to be significantly redesigned to cope with the evolving threat landscape. To this end, we propose cryptographically inspired and moving target-based ADS design philosophies.
The crypto-inspired ADS design aims at randomizing the learnt normal network profile, while the moving target-based ADS design randomizes the feature space employed by an ADS for anomaly detection. Preliminary evaluations show that randomizing ADS parameters greatly improves the robustness of anomaly detection systems against parameter estimation attacks. en_US
dc.publisher SEECS, National University of Science and Technology, Islamabad. en_US
dc.subject Information Technology, MACHINE LEARNING TOOLS en_US
dc.title IMPROVING ANOMALY DETECTION PERFORMANCE USING INFORMATION THEORETIC AND MACHINE LEARNING TOOLS en_US
dc.type Thesis en_US

