Abstract:
One of the most challenging problems in applying deep learning to any task is obtaining a large amount of training data. Generative adversarial networks (GANs), which synthesize new data, are a recent breakthrough widely regarded as a solution to this problem. Among their many applications, the most important is data augmentation for problems where collecting enough data is difficult, and the technique has been adopted across many areas. However, relying too heavily on GANs for data augmentation can be catastrophic, and that is what we demonstrate in this work. The problem we choose to make this point is the security of text CAPTCHAs against deep learning attacks. To attack a CAPTCHA scheme, an adversary must obtain enough labelled data; in recent state-of-the-art work, various CAPTCHA schemes have been broken by using GANs to produce large amounts of augmented data and then training CAPTCHA solvers on it. These works, however, overlook the limitations of what GANs can learn. We show that when the data contains enough random features, a GAN can fail to learn and instead produce garbage outputs that degrade, rather than improve, the performance of the downstream model. In this work, we develop new features for text CAPTCHAs that introduce substantial randomness into the CAPTCHA dataset and thus make it difficult for a GAN to learn. We use the state-of-the-art Pix2Pix GAN to augment the CAPTCHA dataset, test the accuracy of various CAPTCHA solvers on our CAPTCHAs with and without Pix2Pix augmentation, and show that solver accuracy drops significantly when the GAN is used. We also develop features that make CAPTCHAs difficult for deep learning solvers even without a GAN, and thereby propose a CAPTCHA scheme that is secure against deep learning attacks.