dc.description.abstract |
Remote sensing (RS) datasets have gained popularity due to their impact on addressing
global issues such as food security and climate change. The availability of crop type data
is valuable to agronomy managers in addressing food security problems and sustainable
agricultural expansion. However, large-scale crop area maps are unavailable in developing
countries due to the lack of information on croplands and field boundaries.
Therefore, this research aims to design a framework for generating pseudo labels for
satellite data using minimal human effort, particularly in countries such as Pakistan,
with rice as a case study.
A two-stage framework is proposed. The first stage generates ground truth information
in regions where no prior information is available, and the second stage evaluates the
effectiveness of two approaches for expanding the dataset further by generating
pseudo labels. The generation of ground truth data for
rice involves the identification of rice fields and their labelling. A vegetation
indices-based analysis, comprising the temporal Normalized Difference Vegetation Index
(NDVI) and Normalized Difference Water Index (NDWI), is designed in Google Earth Engine
(GEE) to identify rice fields using a rule-based method. The second stage involves the
evaluation of two different approaches for pseudo-label generation: active learning and
traditional machine learning. The active learning-based framework involved iterative
training of a ConvLSTM-based encoder-decoder model to learn temporal, spectral,
and spatial features effectively. After 10 iterations, the model produced labels for the
Gujranwala region with an intersection over union (IoU) of 0.69. However, when the model’s
transferability was evaluated in the Sargodha area, the generated labels had an IoU of
0.45. The testing results indicated that
the model needs further retraining on diverse data samples with a human in the loop.
However, implementing the human-in-the-loop module in the active learning framework
added complexity to the training process, as the labelling requires temporal vegetation
analysis. Therefore, the traditional machine learning-based approach is comparatively
simpler and more efficient for generating large-scale data that can be further utilized in
other applications as well as for training more complex segmentation models. To generate
a crop map using traditional machine learning models, a random forest is trained
on the provided dataset of the Bahawalnagar region for the year 2021. Because the
performance of machine learning models relies heavily on feature engineering, a suitable
feature set for the random forest model is identified. To this end, a series of experiments
is designed to investigate the effect of different temporal, spectral, and sensor inputs
on accuracy. The
results showed that for rice mapping, a combination of multi-spectral and radar-based
sensors yields high accuracy. Similarly, the inclusion of temporal and all spectral
information enhances the accuracy. Furthermore, the model’s generalizability across
regions within the same year is also assessed. The results revealed that the model can
be used to predict rice in the Gujranwala region, as it achieved an accuracy of 93% on the
hand-labelled dataset of that region. Similarly, it achieved 74% accuracy for the Sargodha
region. Moreover, the model’s transferability across years is tested by evaluating its
performance on the dataset for the year 2023 while keeping the geographical region fixed,
i.e., Bahawalnagar. The results showed that the model is transferable across years, as it
predicted rice fields with 78% accuracy even though there has been a shift in climate over
the two years. |
en_US |