dc.description.abstract |
Recent advances in technology have brought a great influx of data, and remote sensing image datasets have grown in both size and number. With this abundance of data comes the problem of retrieving it efficiently for various purposes. Substantial efforts are currently underway to develop new paradigms, techniques, and technologies to improve the retrieval of remote sensing data. The heterogeneity of the data and the large semantic gap between the text and image modalities make this an inherently challenging task. Standard retrieval techniques are not effective when dealing with multimodal remote sensing data. This thesis introduces a framework designed for retrieving target images from a text query and vice versa. Existing techniques for remote sensing text-image retrieval predominantly rely on high-level, or macro, features derived from remote sensing (RS) images, and consequently overlook pertinent low-level, or micro, features that convey valuable information about target relationships and saliency. The proposed model centers on the extraction of image features, followed by their cohesive representation and dynamic integration. It uses macro vision features to correct micro vision features, while the micro vision features in turn enhance the macro vision features of the images. Deep learning methods are used to generate comprehensive representations of both image and text features. Once the image and text queries have been represented, their similarity is calculated and the results are re-ranked. The re-ranking algorithm uses the k nearest neighbors from the retrieval results to conduct a reverse search and, in the process, improves accuracy by integrating several bidirectional retrieval components.
Recall, a predictive evaluation metric, is used to compare the results of the proposed technique with those of a conventional technique. The proposed solution outperformed the conventional technique on two remote sensing datasets, RSICD and RSITMD, for the text-image retrieval task. |
en_US |