NUST Institutional Repository

Sindhi Ligature Recognition in Printed Text: A Large Scale Font Diverse Sindhi Ligature Recognition System

dc.contributor.author Ali, Zeeshan
dc.date.accessioned 2023-08-02T11:04:40Z
dc.date.available 2023-08-02T11:04:40Z
dc.date.issued 2023
dc.identifier.other 319593
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/35440
dc.description Supervisor: Dr. Safdar Abbas Khan en_US
dc.description.abstract The advent of deep learning in computer vision has driven advances across a diverse set of fields. Optical Character Recognition (OCR) plays a vital role in the modern age of Artificial Intelligence. It is a challenging task: difficult to implement and computationally expensive. Sindhi is a literature-rich language spoken by millions of people around the globe, with an abundance of preserved grammatical forms. OCR systems for English have seen significant development, but little work has been done on Arabic script. Most Sindhi literature uses the extended Perso-Arabic script. To the best of our knowledge, no benchmark datasets have been published, and consequently no state-of-the-art Sindhi OCR models have been devised. This thesis attempts to fill this research gap with the following contributions. We have extracted a set of 22,597 ligatures found in Sindhi literature. We present a synthesized benchmark dataset for Sindhi printed text recognition at the ligature level. The dataset is font diverse, comprising 256 unique fonts. Finally, we have set up a baseline neural network for Sindhi ligature recognition in printed text, which achieves 91.85% test accuracy on the benchmark dataset. Our baseline can be used to build the complete pipeline of a font-invariant Sindhi OCR system. en_US
dc.language.iso en_US en_US
dc.publisher School of Electrical Engineering & Computer Sciences (SEECS), NUST en_US
dc.title Sindhi Ligature Recognition in Printed Text: A Large Scale Font Diverse Sindhi Ligature Recognition System en_US
dc.type Thesis en_US


This item appears in the following Collection(s)

  • MS [375]
