NUST Institutional Repository

Exploring the Power of Vision Transformers in Remote Sensing Imagery

dc.contributor.author Shaheen, Muhammad Tariq
dc.date.accessioned 2023-11-30T11:51:31Z
dc.date.available 2023-11-30T11:51:31Z
dc.date.issued 2023
dc.identifier.other 359600
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/40793
dc.description Supervisor: Dr. Hafsa Iqbal en_US
dc.description.abstract Semantic segmentation of aerial images is vital for Unmanned Aerial Vehicle (UAV) applications such as land cover mapping, surveillance, and identifying flood-affected areas for effective natural disaster management and flood impact mitigation. Traditional CNN-based techniques encounter significant challenges in retaining specific information from deeper layers, and existing transformer-based architectures often demand high computational resources or produce single-scale, low-resolution features. To address these limitations, we propose a novel transformer-based model, SwinSegFormer, that harnesses the strengths of SegFormer and the Swin Transformer (SwinT). The model was trained and evaluated on the FloodNet benchmark dataset, focusing on challenging classes such as vehicles, pools, and flooded and non-flooded roads, whose accurate segmentation is crucial for effective disaster management; this potentially allows the model to support first-aid activities during floods. The proposed model achieves a validation mIoU of 71.99%, mDice of 82.86%, and mAcc of 82.69%, an 8-10% improvement over state-of-the-art methods. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science (SEECS), NUST en_US
dc.title Exploring the Power of Vision Transformers in Remote Sensing Imagery en_US
dc.type Thesis en_US
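
The abstract above reports mean IoU, mean Dice, and mean per-class accuracy on FloodNet. The record does not include the evaluation code, so the following Python snippet is only a minimal, hypothetical sketch of how such per-class segmentation metrics are commonly computed from a confusion matrix; the function name, class count, and random inputs are illustrative assumptions, not taken from the thesis.

import numpy as np

def segmentation_metrics(pred, target, num_classes):
    """Per-class IoU, Dice, and accuracy from integer label maps of shape (H, W)."""
    # Confusion matrix: rows = ground-truth class, columns = predicted class.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (target.ravel(), pred.ravel()), 1)

    tp = np.diag(cm).astype(float)   # correctly labelled pixels per class
    fp = cm.sum(axis=0) - tp         # pixels wrongly assigned to the class
    fn = cm.sum(axis=1) - tp         # class pixels assigned elsewhere

    present = cm.sum(axis=1) > 0     # average only over classes present in the ground truth
    iou = (tp / np.maximum(tp + fp + fn, 1))[present]
    dice = (2 * tp / np.maximum(2 * tp + fp + fn, 1))[present]
    acc = (tp / np.maximum(tp + fn, 1))[present]
    return iou.mean(), dice.mean(), acc.mean()

# Illustrative call with random label maps (10 classes, 512 x 512 pixels).
rng = np.random.default_rng(0)
pred = rng.integers(0, 10, size=(512, 512))
target = rng.integers(0, 10, size=(512, 512))
miou, mdice, macc = segmentation_metrics(pred, target, num_classes=10)
print(f"mIoU={miou:.4f}  mDice={mdice:.4f}  mAcc={macc:.4f}")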


This item appears in the following Collection(s)

  • MS [881]
