Abstract:
In this work, we analyzed the Big Transfer (BiT)
model on three kinds of tasks: object detection, image classification,
and instance segmentation. First, for object detection, we used
the BiT model as the backbone and analyzed its performance on two
well-known medical datasets, the diabetic retinopathy and CheXpert
datasets. The diabetic retinopathy dataset consists of 413 color images,
whereas the CheXpert dataset consists of 223,650 grayscale images. After
training and testing, the precision and recall for object
detection are 0.978 and 0.960, respectively. Together, these values
yield an F1 score of 0.969 on the test images
for object detection. Second, we gathered a large medical dataset of
124 classes for classification. We analyzed the performance of the
BiT model variants on this task and compared them with the ResNet50 and
ResNet152 Convolutional Neural Network (CNN) models. In all these tasks,
the BiT model variant BiT M 50x3 achieves the highest accuracy in both the
training and the testing phase. Third, we reduced the number of
images per class, and the BiT model again outperformed all the other CNN
models. Fourth, we used the StyleGAN technique to increase the number
of images per class (only for those medical datasets that have very few
images per class) and then evaluated the performance. Again, the BiT model
performed best among the compared CNN models. Lastly, we
analyzed the performance of the BiT model on the instance segmentation task
using two well-known datasets, the spleen and colon datasets. Both
consist of grayscale images: the spleen dataset contains 63 images,
whereas the colon dataset contains 219 images.
For instance segmentation, we again used the BiT model as the backbone for
evaluating the performance. After training and testing, the precision
and recall for instance segmentation are 0.923 and 0.915, respectively.
Together, these values yield an F1 score of 0.920 on the test images
for instance segmentation.
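
The F1 scores quoted above combine precision and recall. A minimal sketch of that calculation follows, assuming the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the variable names are illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision/recall values as reported in the abstract.
detection_f1 = f1_score(0.978, 0.960)
segmentation_f1 = f1_score(0.923, 0.915)

print(f"object detection F1:      {detection_f1:.3f}")
print(f"instance segmentation F1: {segmentation_f1:.3f}")
```

With the rounded object detection inputs, this gives about 0.969, matching the reported value. With the rounded instance segmentation inputs, it gives about 0.919; the small gap from the reported 0.920 may come from rounding of the published precision and recall, which is an assumption rather than something the abstract states.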