NUST Institutional Repository

Large Scale Fine-Grained Visual Recognition

dc.contributor.author Usman, Omer
dc.date.accessioned 2023-08-19T11:31:03Z
dc.date.available 2023-08-19T11:31:03Z
dc.date.issued 2021
dc.identifier.other 203515
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/36951
dc.description Supervisor: Dr. Muhammad Shahzad en_US
dc.description.abstract Computer vision relies on pattern recognition techniques to learn from and interpret visual data. While traditional machine learning algorithms were previously used for computer vision applications, deep learning methods have since emerged as a better solution for this domain. Deep learning relies on neural networks and training samples: it learns from labeled data to recognize common patterns in a given data set. The wide availability of data for training computer vision algorithms has contributed to the growth of the field. Image classification is a fundamental task that attempts to comprehend an image as a whole. The goal is to classify the image by assigning it a specific label. Typically, image classification refers to images in which only one object appears and is analyzed. CNN-based models hold state-of-the-art performance in various computer vision tasks, including image classification. Deep CNNs are well suited to large-scale supervised visual recognition tasks because their training algorithm is highly scalable. Image classification with deep CNNs falls into two main settings, namely Large Scale Visual Recognition and Fine-Grained Visual Recognition. The availability of highly advanced models from competitions such as ImageNet has made it possible to explore fine-grained classification and other non-deep-learning classifiers for classification. This research presents a combination of models in hierarchical layers that first distinguishes between meta categories (currently handled by large-scale recognition algorithms) and then goes further in depth to classify individual objects (currently handled by fine-grained recognition algorithms). The aim is to develop an approach in which the large-scale visual recognition problem is solved first and fine-grained visual recognition is then performed in the same pipeline, in one go. This research thus presents a single solution to both problems.
The model was evaluated by submitting its predictions to Kaggle for scoring. The pipeline requires the data to be annotated first with broad groups and then with a second set of labels within each broad group, such as the species class. In this way the problem was broken down into parts that could be solved independently. The experiments were performed with state-of-the-art deep learning models such as ResNet-18, ResNet-50 and ResNeXt-101. The results show a visible improvement over the single-model approach. As far as resources are concerned, the proposed methodology has high requirements in terms of memory space and training time, so the trade-off between resource requirements and performance improvement has to be weighed. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science NUST SEECS en_US
dc.title Large Scale Fine-Grained Visual Recognition en_US
dc.type Thesis en_US
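
The two-stage pipeline described in the abstract (a coarse classifier for broad groups, followed by a per-group fine-grained classifier) can be sketched roughly as below. This is only a minimal illustration under stated assumptions, not the thesis implementation: the function hierarchical_predict, the toy scoring functions, and the labels ("bird", "plant", "finch", and so on) are all hypothetical stand-ins for trained networks such as ResNet-50 and for the real label hierarchy.

```python
from typing import Callable, Dict, Sequence, Tuple

# A "classifier" here is any callable mapping an input to per-label scores.
# In practice each would be a trained CNN; that is an assumption of this sketch.
Scores = Dict[str, float]
Classifier = Callable[[Sequence[float]], Scores]


def argmax_label(scores: Scores) -> str:
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


def hierarchical_predict(
    x: Sequence[float],
    coarse: Classifier,
    fine_by_group: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Stage 1 picks a broad group; stage 2 picks a class within that group."""
    group = argmax_label(coarse(x))
    fine = fine_by_group[group]  # one fine-grained model per broad group
    return group, argmax_label(fine(x))


# Toy stand-ins for trained models (hypothetical, not from the thesis).
def toy_coarse(x: Sequence[float]) -> Scores:
    return {"bird": x[0], "plant": 1.0 - x[0]}


def toy_bird(x: Sequence[float]) -> Scores:
    return {"sparrow": x[1], "finch": 1.0 - x[1]}


def toy_plant(x: Sequence[float]) -> Scores:
    return {"oak": x[1], "maple": 1.0 - x[1]}


FINE = {"bird": toy_bird, "plant": toy_plant}

if __name__ == "__main__":
    print(hierarchical_predict([0.9, 0.2], toy_coarse, FINE))
```

One consequence of this design, consistent with the resource trade-off noted in the abstract, is that a separate fine-grained model is held for every broad group, so memory use and training time grow with the number of groups.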


This item appears in the following Collection(s)

  • MS [376]
