Comparison of deep convolutional neural architectures for handwritten numerals

Arif, Ameera

URI: http://10.250.8.41:8080/xmlui/handle/123456789/35165

Date: 2021

Abstract:

Urdu is an old language and one of the most widely spoken in South Asia. It has a rich history and is spoken by communities around the globe. Yet in terms of research and availability of data, it has long been neglected. Recognizers have been built for the language, but each relies on its own specific set of data. Given this dearth of data for research purposes, we collect our own data for Urdu numerals and combine it with Persian numeral data. Urdu draws heavily on Arabic and Persian for its script and vocabulary, and much of its vocabulary consists of loan words; hence Urdu is often thought of as an amalgamation of Arabic, Persian, Turkish and Sanskrit. Urdu and Persian numerals are written in similar patterns, with differences in some of their digits, so they are suitable for comparison. In addition, English is the most widely spoken, written and understood language, and large datasets are available for research on it. We therefore combine Urdu and Persian handwritten numerals with English numerals to create a novel dataset that can be used for classification and recognition across all three languages. We establish a way to exploit the similarities between these languages while keeping in view our main goal of reviving the importance of Urdu in research. Using this combination of Urdu, English and Persian handwritten numerals, we build deep learning-based models that categorize the numerals by learning the subtle similarities among them. Our novel dataset contains 9800 images of handwritten Urdu numerals written by over 200 individuals with both their left and right hands, in order to incorporate diversity into the dataset. The popular Persian dataset by E. Kabir et al. and the MNIST dataset of English numerals are incorporated to form a combined dataset of the three languages.
We propose several variants of a custom-built convolutional neural network (CNN) to recognize the numerals in the proposed dataset. Alongside our own CNN, we evaluate the established CNN architectures VGGNet, ResNet, GoogLeNet and Xception. Our proposed models are powerful yet simple, and obtain excellent performance when evaluated on our novel dataset.
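
To illustrate how three numeral collections could be merged into one classification dataset, the following is a minimal sketch and not the thesis's actual pipeline. It assumes that every script's digit is treated as its own class (30 classes in total), that all images have already been resized to a common shape, and that pixel values lie in 0-255; the abstract does not state any of these details. The names `SCRIPTS`, `combine_numeral_sets` and `split_label` are hypothetical.

```python
import numpy as np

# Assumption: script order is arbitrary and chosen only for this sketch.
SCRIPTS = ("urdu", "persian", "english")
DIGITS_PER_SCRIPT = 10


def combine_numeral_sets(sets):
    """Merge per-script (images, digits) pairs into one joint dataset.

    `sets` maps a script name to (images, digits), where images has
    shape (n, H, W) with a shared H and W, and digits holds values 0-9.
    The joint label is script_index * 10 + digit (assumed scheme).
    """
    images, labels = [], []
    for script_index, name in enumerate(SCRIPTS):
        x, y = sets[name]
        x = np.asarray(x, dtype=np.float32) / 255.0  # scale pixels to [0, 1]
        y = np.asarray(y, dtype=np.int64)
        images.append(x)
        labels.append(script_index * DIGITS_PER_SCRIPT + y)
    return np.concatenate(images), np.concatenate(labels)


def split_label(label):
    """Recover (script name, digit) from a joint label."""
    return SCRIPTS[label // DIGITS_PER_SCRIPT], label % DIGITS_PER_SCRIPT
```

Under this assumed scheme, a joint label such as 17 decodes to the Persian digit 7, so a single classifier can report both the script and the digit.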