NUST Institutional Repository

ExtractID: Deep Information Extraction from Identity Documents

dc.contributor.author ZAIN UL ABIDIN, SYED
dc.date.accessioned 2024-07-04T10:02:43Z
dc.date.available 2024-07-04T10:02:43Z
dc.date.issued 2024
dc.identifier.other 329793
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/44539
dc.description Supervisor: Dr. Hafsa Iqbal en_US
dc.description.abstract Efficient extraction of information from identification documents, such as Computerized National Identity Cards (CNICs), is a pivotal part of modern document analysis and information retrieval systems. Traditional Optical Character Recognition (OCR) techniques often fall short on real-world challenges such as blurred images, varying illumination, and complex backgrounds. This thesis presents an approach that combines an OCR-free model known as "Donut" with pre-processing and optimization techniques to improve the accuracy and robustness of information extraction. The study begins with a localization task that uses YOLOv5 to detect text, coupled with OCR-based recognition and extraction using Tesseract. Recognizing the limitations of OCR techniques, the research then transitions to the OCR-free approach and prepares a self-annotated dataset of CNICs encoded in JSON Lines text format. The proposed methodology involves dataset pre-processing and augmentation for training, comprising random crop, random rotation, random brightness-contrast adjustment, and Gaussian noise injection. The Donut model configuration is detailed, and the model is optimized for memory usage, with emphasis on its ability to handle blurred, dark, bright, and noisy images. With the proposed pipeline, the model achieves an accuracy of 99.96% and an F1 score of 99.46% on test data, indicating robust performance under real-world conditions. In addition, HTML bio-data forms are prepared and used to train the Donut model with the same pipeline, which yields consistent performance on test data. To facilitate practical use, a Django API is developed for testing images, demonstrating the model’s effectiveness in real-time applications. 
The findings of this research underscore the significance of OCR-free approaches, specifically the Donut algorithm, in overcoming the limitations of traditional OCR techniques. The outcomes confirm the model’s exceptional performance in information extraction tasks related to ID cards, laying the foundation for advancements in document analysis, identity verification, and broader applications in the field of information retrieval. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering & Computer Science (SEECS), NUST en_US
dc.title ExtractID: Deep Information Extraction from Identity Documents en_US
dc.title.alternative ExtractID: Deep Information Extraction from Identity Documents en_US
dc.type Thesis en_US
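
The abstract lists four training-time augmentations (random crop, random rotation, random brightness-contrast adjustment, and Gaussian noise injection) but this record does not say which library or parameter ranges the thesis used. The following is only a minimal sketch of what such a pipeline could look like, written with plain NumPy. All function names, probabilities, and ranges (`min_keep`, `max_deg`, `sigma`, `p`, and so on) are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np


def random_crop(img, rng, min_keep=0.9):
    """Crop a random window that keeps roughly `min_keep` or more of each side."""
    h, w = img.shape[:2]
    new_h = int(h * rng.uniform(min_keep, 1.0))
    new_w = int(w * rng.uniform(min_keep, 1.0))
    top = rng.integers(0, h - new_h + 1)
    left = rng.integers(0, w - new_w + 1)
    return img[top:top + new_h, left:left + new_w]


def random_rotate(img, rng, max_deg=10.0):
    """Rotate about the centre by a random angle (nearest neighbour, zero fill)."""
    h, w = img.shape[:2]
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Inverse mapping: for each output pixel, look up its source pixel.
    src_x = np.rint(cos_t * (xs - cx) + sin_t * (ys - cy) + cx).astype(int)
    src_y = np.rint(-sin_t * (xs - cx) + cos_t * (ys - cy) + cy).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out


def random_brightness_contrast(img, rng, max_brightness=0.2, max_contrast=0.2):
    """Apply out = alpha * img + beta with a random gain and offset."""
    alpha = 1.0 + rng.uniform(-max_contrast, max_contrast)       # contrast gain
    beta = 255.0 * rng.uniform(-max_brightness, max_brightness)  # brightness shift
    out = alpha * img.astype(np.float32) + beta
    return np.clip(out, 0, 255).astype(np.uint8)


def gaussian_noise(img, rng, sigma=10.0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)


def augment(img, rng, p=0.5):
    """Apply each augmentation independently with probability `p`."""
    steps = (random_crop, random_rotate, random_brightness_contrast, gaussian_noise)
    for transform in steps:
        if rng.random() < p:
            img = transform(img, rng)
    return img


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    card = np.full((64, 96, 3), 128, dtype=np.uint8)  # stand-in for a CNIC image
    print(augment(card, rng).shape)
```

Passing a seeded `np.random.default_rng` keeps the augmentations reproducible. In practice a dedicated image-augmentation library would likely replace these hand-written transforms.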


This item appears in the following Collection(s)

  • MS [881]
