NUST Institutional Repository

Machine Learning Based QBE Word Spotting of Urdu Text

dc.contributor.author Farooqui, Faiq Faizan
dc.date.accessioned 2023-08-27T05:37:07Z
dc.date.available 2023-08-27T05:37:07Z
dc.date.issued 2019
dc.identifier.other 172015
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/37599
dc.description Supervisor: Dr. Shahzad Younis en_US
dc.description.abstract With the increased digitization of documents over the past decades, the task of word spotting has acquired much significance in the field of document analysis and recognition. Deep learning has revolutionized many fields and promises to make similar inroads in this field and improve performance on various document analysis tasks. This research presents systems for word spotting of Urdu text using effective feature extraction. The systems take ligature images of Urdu text and extract features to train two different learning models. For feature extraction, HOG features and autoencoders were used. The classifiers used in this study were SVM and LSTM models. The systems were tested on two separate data sets, one of printed and one of handwritten Urdu text. The systems produced outstanding results when trained on the printed Urdu database. For the handwritten database, however, the intra-class variation is too large, which results in poor accuracy. Hence, for handwritten text, the data was modified during preprocessing to create three different data sets, which improved the system's performance significantly. As Convolutional Neural Networks are best suited for classification with image inputs, a comparison between the results obtained in this study and classification by a CNN on both data sets is presented in the results section. The CNN architecture used for the comparison is VGG16. The research shows that the best results were obtained when the LSTM was trained on HOG features of the ligature images. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science (SEECS), NUST en_US
dc.title Machine Learning Based QBE Word Spotting of Urdu Text en_US
dc.type Thesis en_US


This item appears in the following Collection(s)

  • MS [882]
