Abstract:
Urdu is a primitive and one of the most popular languages in South Asia. It has a rich history and
worldwide appeal with it being spoken in more than 200 countries around the globe. But in terms
of research and availability of data, it has been neglected for so long. People have built recognizers
for the language but with their own specific set of data. So, with a dearth of data for research
purposes we collect our own data for numerals and mix it with Persian numbers data. Urdu relies
heavily on Arabic and Persian calligraphy for vocabulary, which in turn makes it a combination of
loan words. Hence Urdu is thought to be an amalgamation of Arabic, Persian, Turkish and Sanskrit.
Urdu, and Persian numbers are written on similar patterns with certain differences between some
of their digits, so they are compatible for comparison purposes. Along with that English is the most
widely spoken, written, understood language and also has huge datasets for research available. So,
we employ Urdu and Persian handwritten numerals with English numerals to make a novel dataset
that can be used for classification and recognition of all the three languages. We establish a unique
way to exploit the similarities between these languages while keeping in view our main goal of
reviving Urdu language’s importance in research field.
We present a combination of Urdu, English, and Persian handwritten numbers to build deep
learning-based models that can categorize the different numbers by understanding the subtle
similarities between each other. Our novel dataset contains 9800 images of handwritten Urdu
numerals written by over 200 individuals with their left and right hands in order to incorporate
diversity in dataset. The popular Persian dataset by E. Kabir et al. and MNIST dataset for English
numbers is incorporated to make a combined dataset of these three languages. We propose some
versions of custom-built convolutional neural network to achieve remarkable accuracy in
recognizing characters belonging to the proposed dataset. Along with our own proposed CNN, we
use CNN architectures VGGNet, ResNet, GoogLeNet and Xception to achieve remarkable results.
Our proposed models are powerful, yet simple, and obtain excellent performance when evaluated
on our novel dataset.