Abstract:
In recent years, deep learning has gained popularity over traditional machine learning
techniques owing to its accuracy and precision when trained on a substantial amount of data. In this
research, a state-of-the-art deep learning technique has been employed for classification and
prediction of cassava leaf diseases. As the second-largest producer of carbohydrates in the
world, the cassava plant has become an important source of calories for people in tropical regions, but
it is highly susceptible to viral, bacterial, and fungal attacks that stunt plant growth and
hence reduce yield. The dataset used in this research is taken from a Kaggle competition
containing 21,397 images of cassava plant leaves belonging to 5 classes: Cassava Bacterial Blight,
Cassava Brown Streak Disease, Cassava Green Mottle, Cassava Mosaic Disease and Healthy leaf.
In this research work, EfficientNet models were trained using a transfer learning approach. Further,
to remove background noise, segmentation was performed using U-Net to extract only the leaves
from images. Since the dataset was imbalanced, detailed image augmentation was also performed
to increase the sample size of minority classes. Our model performed reasonably on the
balanced dataset, giving 89.97% accuracy. However, results on the original (imbalanced) dataset
were comparable: EfficientNet-B0 trained on the segmented dataset gave a mean F1-score of 0.89
and a mean accuracy of 89.73% ± 0.82 (standard deviation) under 7-fold cross-validation. For
comparison, the Kaggle 2019 cassava disease classification dataset was also used; it gave a mean
accuracy of 89.41% ± 1.62 under 7-fold cross-validation and an F1-score of 0.9, exceeding all
previously reported state-of-the-art results on the same dataset.
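
As an illustration of how a 7-fold score such as "mean ± standard deviation" can be produced, the following is a minimal sketch only, not the authors' pipeline. It assumes stratified folds (so that each fold keeps roughly the class ratios of an imbalanced dataset) and the sample standard deviation; the abstract specifies neither. The function names and the train_and_score callback, which stands in for training a model on one split and returning its validation score, are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k=7, seed=0):
    """Assign sample indices to k folds, keeping class ratios similar (assumed scheme)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def summarize(scores):
    """Return (mean, sample standard deviation) of the per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def cross_validate(labels, train_and_score, k=7, seed=0):
    """Run k-fold cross-validation with a caller-supplied (hypothetical) scorer."""
    folds = stratified_folds(labels, k=k, seed=seed)
    scores = []
    for held_out in range(k):
        val_idx = folds[held_out]
        # Every fold except the held-out one is used for training.
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        scores.append(train_and_score(train_idx, val_idx))
    return summarize(scores)
```

In use, train_and_score would fit a classifier on train_idx and return its accuracy (or F1-score) on val_idx; the returned pair is then reported as the mean and its spread over the seven folds.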