dc.description.abstract |
Optical Character Recognition (OCR) is the recognition of handwritten or printed
text for various digital processing tasks. It is an important area of research in the
fields of image processing, natural language processing, and artificial intelligence.
Many real-world applications, such as price tag scanners, online translators,
and text-to-speech converters, are based on OCR. Research on English OCR has
progressed remarkably, and many robust OCR systems have been
developed for the English language. Research on Urdu OCR is comparatively recent, and
to date no sophisticated Urdu OCR system exists. Urdu is the national language
of Pakistan and is spoken by over 300 million people around the world. Owing to
this importance, there is a need to develop a method for recognition of Urdu script.
The lack of attention to Urdu OCR is due to the complexity of this language. It is a
highly cursive and context-sensitive language. A single word has numerous structural
variations, which make it difficult to recognize. Owing to these challenges,
no benchmark dataset for Urdu has been developed. The focus of our research is
to develop an effective method for the recognition of Urdu script. In our proposed
method, we developed an Urdu ligature dataset, named the CEFAR dataset, and used
deep learning to recognize these ligatures. The dataset contained exhaustive
combinations of Urdu characters of length 2 and 3. The ligatures were stored as
images and divided into three sets: a 2-character ligature set, a 3-character
ligature set, and a combined set containing ligatures of both lengths. The aim was to train a
deep learning model for recognizing these ligatures. The main challenge associated
with this data was its high number of classes and low intra-class variability.
To address this problem, we developed a novel technique that uses
data augmentation to increase the number of representative samples within
each ligature class and then applies class redefinition to reduce the number of classes.
Data augmentation was followed by ligature recognition using Recurrent Neural
Networks (RNNs). A 34-layer
recurrent neural network model was used with a final fully connected layer for
classification. Three separate models were trained and evaluated, one for each of the
three ligature sets. All three models achieved high accuracy.
The models achieved an accuracy of 96.4% on the 2-character ligature set, 99.7% on the
3-character ligature set, and 97.5% on the combined set of 2- and 3-character ligatures.
The overall performance of the recognition system was taken to be
the accuracy of the model on the combined 2- and 3-character ligature dataset,
which was 97.5%. Our model outperformed the existing deep
learning methods for Urdu ligature recognition. The high classification accuracy
of our deep learning model shows that this research contributes effectively not
only to building a benchmark Urdu dataset but also to its classification. In future, this
research could be extended to develop classification systems for ligatures
of length greater than three. |
en_US |