NUST Institutional Repository

Urdu Optical Character Recognition.

dc.contributor.author Syed Wajih Ul Hassnain Shah, Supervised by Dr Hasan Sajid
dc.date.accessioned 2021-09-08T07:34:28Z
dc.date.available 2021-09-08T07:34:28Z
dc.date.issued 2021
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/25884
dc.description.abstract Existing commercial software such as ABBYY and the Google Vision API supports Arabic and Urdu text, but accuracy remains low because of the writing style of these non-Latin scripts. Research on Urdu OCR began in 2003 with a system that could recognise only isolated Urdu characters; interest has since grown with the increase in available data and digitisation. A further motivation is the rapid growth of Urdu data in the financial and economic sectors, including printed and handwritten scanned documents. Optical character recognition for Urdu is challenging because Urdu is a non-Latin script with a cursive writing style, and the resulting challenges must be addressed in different phases of the OCR system. This thesis presents an Urdu Optical Character Recognition (OCR) system and a data generation and encoding technique that helps standardise data for optical character recognition. The proposed model is a four-stage network: the first stage normalises the image, the second and third stages perform feature extraction and sequence generation, and the final stage predicts the digitised text present in the image. The proposed algorithm is compared against baseline implementations of widely adopted supervised deep learning methods. en_US
dc.language.iso en_US en_US
dc.publisher SMME en_US
dc.relation.ispartofseries SMME-TH-628;
dc.subject Optical Character Recognition (OCR), Deep Learning, Supervised Learning, Convolutional Neural Networks (CNN), Sequential Models, Connectionist temporal classification (CTC), Spatial Transformation Network (STN) en_US
dc.title Urdu Optical Character Recognition. en_US
dc.type Thesis en_US
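
The abstract describes a final prediction stage, and the subject keywords list Connectionist Temporal Classification (CTC). This record does not say how the thesis decodes its network output, so the following is only a minimal sketch of standard greedy (best-path) CTC decoding, not the author's implementation. The function name `ctc_greedy_decode`, the blank index `0`, and the index-to-character mapping are assumptions introduced here for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-time-step score lists, one score per class.
    alphabet: dict mapping class index -> character (hypothetical mapping).
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of the same class into a single symbol.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal symbols keeps them distinct.
    return "".join(alphabet[i] for i in collapsed if i != blank)
```

Given per-frame scores whose best path is `[1, 1, 0, 1, 2]`, the decoder collapses the repeated `1`, keeps the second `1` because a blank separates it from the first, and returns the characters for `1, 1, 2`. Decoded characters come out in logical order, which is how Unicode stores right-to-left scripts such as Urdu.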


This item appears in the following Collection(s)

  • MS [204]
