Abstract:
Breast cancer, the most invasive type of cancer in women, requires early diagnosis. Recent studies have proposed various machine learning classifiers that achieve accurate results. One limitation of these methods is that they require well-labeled data, and obtaining a sufficient amount of accurately labeled data is challenging in medical imaging. To overcome this, semi-supervised learning (SSL) algorithms, which can exploit labeled and unlabeled data together, could be utilized. The aim of this work is to classify a breast cancer histopathological image dataset by extracting deep and radiomic features, and to compare the performance of different classifiers under supervised learning (SL) and SSL approaches. Deep features are extracted from the histopathological images by transfer learning with the VGG16, VGG19 and RESNET50 architectures and are used as input to train six different SL classifiers. An SSL approach is developed by training the classifier on labeled data, predicting labels for the unlabeled data, and combining both sets for further evaluation. A second feature extractor, radiomics, is also proposed to extract a set of handcrafted features for training the classifiers. The evaluation results of the SL classifiers are compared with those of the SSL approach for all types of features extracted from the BreakHis400X image dataset, and hyperparameters are optimized to ensure the robustness of the models. The proposed approaches are compared using accuracy, precision, recall, F1-score, and ROC curves. Classification using deep features outperformed classification using radiomic features. Among the feature extractors, deep features extracted using RESNET50 yielded the highest accuracies for both approaches, ranging from 80.2% to 92.2% for SL and from 79.2% to 89.9% for SSL.
Classification using radiomic features resulted in accuracies between 65.3% and 84.2% for SL and between 64.9% and 84.5% for SSL. For every feature extraction method, the SL and SSL classifiers achieved nearly the same accuracies. The LR and SVM classifiers performed best on VGG16, VGG19 and RESNET50 features, while the RF classifier performed best on radiomic features.
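
The semi-supervised procedure summarized above (train on labeled data, predict labels for the unlabeled data, combine both) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses synthetic feature vectors from `make_classification` as a stand-in for the extracted deep or radiomic features, and assumes that the combined labeled and pseudo-labeled set is used to retrain the classifier before evaluation. The helper names `pseudo_label_fit` and `evaluate`, the 30% labeled fraction, and the choice of logistic regression are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split


def pseudo_label_fit(make_clf, X_lab, y_lab, X_unlab):
    """Train on labeled data, pseudo-label the unlabeled data, retrain on both."""
    base = make_clf().fit(X_lab, y_lab)        # supervised baseline
    pseudo = base.predict(X_unlab)             # predicted (pseudo) labels
    X_all = np.vstack([X_lab, X_unlab])        # labeled + unlabeled features
    y_all = np.concatenate([y_lab, pseudo])    # true + pseudo labels
    final = make_clf().fit(X_all, y_all)       # retrained on the combined set
    return base, final, pseudo


def evaluate(clf, X_test, y_test):
    """Compute the metrics named in the abstract; ROC AUC summarizes the ROC curve."""
    y_pred = clf.predict(X_test)
    y_score = clf.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }


# Synthetic stand-in for extracted feature vectors (NOT BreakHis data).
X, y = make_classification(n_samples=600, n_features=32, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Assumption: 30% of the training set keeps its labels; the rest is "unlabeled".
X_lab, X_unlab, y_lab, _ = train_test_split(
    X_train, y_train, train_size=0.3, stratify=y_train, random_state=0)


def make_lr():
    return LogisticRegression(max_iter=1000)


base, final, pseudo = pseudo_label_fit(make_lr, X_lab, y_lab, X_unlab)
sl_metrics = evaluate(base, X_test, y_test)
ssl_metrics = evaluate(final, X_test, y_test)
print("SL :", {k: round(v, 3) for k, v in sl_metrics.items()})
print("SSL:", {k: round(v, 3) for k, v in ssl_metrics.items()})
```

In this sketch the pseudo-labels are accepted in a single pass without a confidence threshold. An iterative or thresholded variant is also plausible, and the abstract does not specify which is used.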