Generating Pseudo Labels for Rice Crop Classification in Punjab Region: A Two-Stage Approach

Murtaza, Ramesha

dc.contributor.author	Murtaza, Ramesha
dc.date.accessioned	2024-07-31T11:52:47Z
dc.date.available	2024-07-31T11:52:47Z
dc.date.issued	2024
dc.identifier.other	362862
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/45093
dc.description	Supervisor: Dr Muhammad Moazam Fraz	en_US
dc.description.abstract	Remote sensing (RS) datasets have gained popularity because of their value in addressing global issues such as food security and climate change. Crop type data help agronomy managers address food security problems and plan sustainable agricultural expansion. However, large-scale crop area maps are unavailable in developing countries because information on croplands and field boundaries is lacking. This research therefore aims to design a framework for generating pseudo labels for satellite data with minimal human effort, particularly in regions like Pakistan, with rice as a case study. A two-stage framework is proposed. The first stage generates ground truth information in regions where no prior information is available, and the second stage evaluates the effectiveness of two approaches in expanding the dataset further by generating pseudo labels. Generating ground truth data for rice involves identifying rice fields and labelling them. A vegetation indices-based analysis is designed in Google Earth Engine (GEE); it comprises temporal Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) and identifies rice fields using a rule-based method. The second stage evaluates two approaches for pseudo-label generation: active learning and traditional machine learning. The active learning-based framework involved iterative training of a ConvLSTM-based encoder-decoder model to learn temporal, spectral, and spatial features effectively. After 10 iterations, the model produced labels for the Gujranwala region with 0.69 IoU. However, when the model’s transferability was evaluated in the Sargodha area, the generated labels had 0.45 IoU. The testing results indicated that the model needs further retraining on diverse data samples using a human-in-the-loop.
However, the implementation of the human-in-the-loop module in the active learning framework added complexity to the training process, as the labelling requires temporal vegetation analysis. Therefore, the traditional machine learning-based approach is comparatively simpler and more efficient for generating large-scale data that can be further utilized in other applications as well as for training more complex segmentation models. To generate a crop map using traditional machine learning models, a random forest is trained on the provided dataset of the Bahawalnagar region for the year 2021. Because the performance of machine learning models relies heavily on feature engineering, a feature set is identified for the random forest model. To that end, a series of experiments is designed to investigate the impact of different temporal, spectral, and sensor inputs on accuracy. The results showed that for rice mapping, a combination of multi-spectral and radar-based sensors yields high accuracy. Similarly, the inclusion of temporal and all spectral information enhances the accuracy. Furthermore, the model’s generalizability across regions within the same year is also assessed. The results revealed that the model can be used to predict rice in the Gujranwala region, as it performed well on the hand-labelled dataset of that region, providing an accuracy of 93%. For the Sargodha region, it achieved 74% accuracy. Moreover, the model’s transferability across years is also tested by evaluating its performance on the dataset for the year 2023, keeping the geographical region consistent, i.e., Bahawalnagar. The results showed that the model is transferable across years, as it predicted rice fields with 78% accuracy even though there has been a shift in climate over the two years.	en_US
dc.language.iso	en	en_US
dc.publisher	School of Electrical Engineering & Computer Science (SEECS), NUST	en_US
dc.title	Generating Pseudo Labels for Rice Crop Classification in Punjab Region: A Two-Stage Approach	en_US
dc.type	Thesis	en_US
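
The abstract refers to NDVI, NDWI, and IoU without defining them. The following is a minimal Python sketch of the standard definitions, not the thesis's implementation. The function names are hypothetical, and the NDWI variant shown (green and near-infrared bands) is an assumption, since the abstract does not say which formulation was used. The thesis's rule-based thresholds are not given in this record and are not reproduced here.

```python
from typing import Sequence


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir: float, red: float) -> float:
    """NDVI from near-infrared and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """NDWI from green and near-infrared reflectance.

    Assumption: this is the green/NIR formulation; other NDWI
    variants use near-infrared and shortwave-infrared bands instead.
    """
    return normalized_difference(green, nir)


def iou(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Intersection over union for two binary masks (1 = rice, 0 = other).

    Both masks are flattened sequences of equal length. If neither mask
    contains a positive pixel, the masks agree and 1.0 is returned.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same length")
    pairs = list(zip(predicted, reference))
    intersection = sum(1 for p, r in pairs if p == 1 and r == 1)
    union = sum(1 for p, r in pairs if p == 1 or r == 1)
    return intersection / union if union else 1.0


if __name__ == "__main__":
    # Illustrative reflectance values only; not data from the thesis.
    print("NDVI:", ndvi(nir=0.5, red=0.1))
    print("NDWI:", ndwi(green=0.3, nir=0.1))
    print("IoU:", iou([1, 1, 0, 0], [1, 0, 1, 0]))
```

Read this way, the reported IoU of 0.69 for Gujranwala means that the predicted and reference rice pixels overlap on 69% of their combined area.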