Abstract:
In the field of AI and machine learning research, adversarial machine learning (AML), a class of techniques that deceive models using maliciously crafted inputs, is becoming a major concern. By exploiting the inherent vulnerability stemming from ML models' reliance on data, AML can be used to generate adversarial attacks. Research has shown that a small perturbation in an input image can have disastrous consequences for an autonomous driving system, e.g., misclassifying a stop sign as a speed-limit sign near a school. To counter these adversarial attacks,
several defense mechanisms have been proposed. Some of the most prominent defenses are adversarial training, pre-processing-based defenses, and Generative Adversarial Network (GAN)-based defenses. However, most of these defenses are either computationally expensive or become ineffective under the white-box threat model or against decision-based attacks (adversarial attacks that exploit only the model's final decision under black-box settings).
settings). Therefore, there is a dire need to develop efficient defense mechanisms that
can effectively counter the attacks while maintaining the classification accuracy. In this
thesis, we propose to develop a computationally efficient and effective defense mechanism that effectively counters the score-based and decision-based adversarial attack
under black-box settings while maintaining the classification accuracy on clean images.