NUST Institutional Repository

Hyper-spectral Document Image Segmentation

dc.contributor.author Mirza, Hasan Irtaza
dc.date.accessioned 2022-08-17T06:36:41Z
dc.date.available 2022-08-17T06:36:41Z
dc.date.issued 2022
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/30083
dc.description CL-T-6646 en_US
dc.description.abstract Separating overlapping text and non-text regions in images is one of the challenging tasks in document analysis and segmentation. Such separation was previously difficult because existing signature datasets, which consist of RGB images, carry limited information per pixel. A hyper-spectral image (HSI) records each pixel across many channels, which can overcome the limited pixel information of RGB images. No public dataset of hyper-spectral document images is available, so the experiments were performed on HSI document images collected with specialized hyper-spectral cameras. Techniques such as FIPPI, PPI, NFINDR, and ATGP are used to extract end members, e.g. signature and printed text. The HSI image is then regenerated from the extracted end members using SAM and SID. This research proposes a five-step process built on these end-member extraction methods. In the first step, the HSI image is preprocessed; in the second, its spectral signatures are generated using the end-member extraction techniques; in the third, they are converted into an image using the SAM and SID classifiers; in the fourth, the result is post-processed using connected component analysis; and in the last step, the post-processed image is evaluated using precision, recall, and F1 score. The proposed methodology achieves a precision of more than 50%. Deep learning provides a better approach and better results than traditional processes. Overlapping and non-overlapping signatures on printed text can also be separated using deep learning. A deep neural network-based technique named Hybrid SN [1] is available for the extraction of end members. It uses a combination of 3D and 2D CNNs to learn end members and then extracts them from the HSI image. Applied to the current private HSI dataset for the separation of overlapping signatures on printed text, it achieved a precision of 78%. Both proposed approaches address the separation of signature and printed text, and the results can be improved in future research. en_US
dc.description.sponsorship Dr. Imran Malik en_US
dc.language.iso en en_US
dc.publisher SEECS-School of Electrical Engineering and Computer Science NUST Islamabad en_US
dc.title Hyper-spectral Document Image Segmentation en_US
dc.type Thesis en_US
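
The classification and evaluation steps named in the abstract (SAM and SID classifiers, then precision, recall, and F1 score) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes NumPy, an HSI cube of shape (H, W, B) with non-negative values, and end members already extracted by some method such as those listed in the abstract. All function names (`sam_angles`, `sid_divergence`, `classify_min`, `precision_recall_f1`) are hypothetical.

```python
import numpy as np


def sam_angles(cube, endmembers):
    """Spectral angle (radians) between each pixel and each end member.

    cube: (H, W, B) array, endmembers: (K, B) array. Returns (H, W, K).
    """
    cube = np.asarray(cube, dtype=float)
    em = np.asarray(endmembers, dtype=float)
    pixels = cube.reshape(-1, cube.shape[-1])
    dots = pixels @ em.T
    norms = np.linalg.norm(pixels, axis=1, keepdims=True) * np.linalg.norm(em, axis=1)
    # Guard against zero-norm spectra and rounding outside [-1, 1].
    cos = np.clip(dots / np.maximum(norms, 1e-12), -1.0, 1.0)
    return np.arccos(cos).reshape(cube.shape[0], cube.shape[1], -1)


def sid_divergence(cube, endmembers, eps=1e-12):
    """Spectral information divergence: D(p||q) + D(q||p) per pixel/end member.

    Spectra are normalised to sum to 1, so values are assumed non-negative.
    """
    cube = np.asarray(cube, dtype=float)
    em = np.asarray(endmembers, dtype=float) + eps
    pixels = cube.reshape(-1, cube.shape[-1]) + eps
    p = pixels / pixels.sum(axis=1, keepdims=True)  # (N, B)
    q = em / em.sum(axis=1, keepdims=True)          # (K, B)
    log_p, log_q = np.log(p), np.log(q)
    d_pq = (p * log_p).sum(axis=1, keepdims=True) - p @ log_q.T  # (N, K)
    d_qp = (q * log_q).sum(axis=1) - log_p @ q.T                 # (N, K)
    return (d_pq + d_qp).reshape(cube.shape[0], cube.shape[1], -1)


def classify_min(scores):
    """Label each pixel with the index of the closest end member."""
    return np.argmin(scores, axis=-1)


def precision_recall_f1(pred_mask, true_mask):
    """Pixel-wise precision, recall, and F1 for two boolean masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return float(precision), float(recall), float(f1)
```

Given a label map from `classify_min`, a mask for one class (for example `labels == 0` for a hypothetical signature end member) could then be post-processed and compared against a ground-truth mask with `precision_recall_f1`. Both SAM and SID are insensitive to a uniform scaling of a spectrum, which is one reason they are commonly used when illumination varies across a page.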


This item appears in the following Collection(s)

  • MS [375]
