Abstract:
Vehicle re-identification plays an important role in intelligent traffic monitoring systems.
Its application is not limited to vehicle monitoring or surveillance: an efficient vehicle
re-identification procedure allows a system to detect and track a vehicle accurately and in a
timely manner, which can also play an important part in forensic analysis. The procedure is
based on a deep neural network: given a vehicle image from an arbitrary camera, the system
learns to distinguish between different vehicles. Most current algorithms solve this problem
in a fully supervised manner and therefore require a large amount of labeled training data.
However, collecting such a large labeled dataset is almost impossible due to its high cost.
Moreover, in practical scenarios the test data contains unseen vehicle images on which the
model was never trained, so a more robust model is required to handle unseen data. Zero-Shot
Vehicle Re-Identification, an unsupervised model, is therefore proposed to handle unseen data
in real-time scenarios.
Two consistencies are proposed to make the model work on unseen data: cross-view support
consistency (CVSC) and cross-view projection consistency (CVPC).
Suppose we have vehicle images from two cameras, Ca and Cb. Despite significant viewpoint
distortion and object occlusion, the visual appearance of images going from Ca to Cb undergoes
similar illumination changes and blur variation. Hence, a vehicle image from camera Ca can be
represented by images from Cb, and a vehicle image from camera Cb can be represented by images
from Ca. Cross-view support consistency states that one image can be represented by other
images through sparse coding: the sparse representatives of the probe image and of the gallery
images are computed, and those gallery images whose representatives have the maximum overlap
with the representatives of the probe image are selected.
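As an illustration of this step, the following minimal sketch (not from the thesis) sparse-codes
probe and gallery features over a shared reference set with scikit-learn's Lasso and ranks
gallery images by support overlap with the probe; the helper names `sparse_support` and
`cvsc_scores`, the shared reference set, and all parameter values are assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_support(x, reference, alpha=0.05, top=10):
    """Sparse-code feature x over the columns of `reference` and return
    the indices of its largest non-zero coefficients (its support)."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(reference, x)
    coef = np.abs(coder.coef_)
    nz = np.flatnonzero(coef)
    return set(nz[np.argsort(coef[nz])[::-1][:top]])

def cvsc_scores(probe, gallery_feats, reference):
    """Score each gallery image by the overlap between its support and
    the probe's support, both coded over the same reference set."""
    probe_sup = sparse_support(probe, reference)
    return np.array([len(probe_sup & sparse_support(g, reference))
                     for g in gallery_feats])

# Toy example: 128-D features, 50 reference images, 20 gallery images.
rng = np.random.default_rng(0)
reference = rng.normal(size=(128, 50))   # columns are reference images
probe = rng.normal(size=128)
gallery = rng.normal(size=(20, 128))     # rows are gallery images
print(cvsc_scores(probe, gallery, reference))
```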
The idea behind cross-view projection consistency is that a probe image and a gallery image of
the same vehicle should share more common neighbors than a probe and a gallery image of
different vehicles. The neighborhood of each vehicle image is identified by
calculating the Euclidean distance between gallery and probe images. The k nearest neighbors
(KNN) of the gallery and probe images are selected, and the gallery images whose neighborhoods
overlap more with the neighborhood of the probe image have stronger projection consistency
with the probe image.
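A minimal sketch of this neighborhood-overlap step, assuming plain Euclidean k-NN over feature
vectors and a common reference set against which neighbors are found (the helpers `knn_indices`
and `cvpc_scores` are illustrative, not from the thesis):

```python
import numpy as np

def knn_indices(query, refs, k=10):
    """Indices of the k nearest neighbors of `query` among the rows
    of `refs`, using Euclidean distance."""
    d = np.linalg.norm(refs - query, axis=1)
    return set(np.argsort(d)[:k])

def cvpc_scores(probe, gallery_feats, reference_feats, k=10):
    """Score each gallery image by how many of its k nearest neighbors
    (within the reference set) it shares with the probe image."""
    probe_nn = knn_indices(probe, reference_feats, k)
    return np.array([len(probe_nn & knn_indices(g, reference_feats, k))
                     for g in gallery_feats])
```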
The neighborhoods of images from camera Ca are calculated directly with the Euclidean distance;
for images from Cb, however, the Cb images and a basic reference subset from Ca are first
projected to a virtual camera Cv, and the distances are then calculated with the learnt metric.
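One common way to realize such a projection, assumed here for illustration, is to factor a
learnt Mahalanobis metric as M = LᵀL and map features through L, so that ordinary Euclidean
distance in the virtual camera space equals the metric distance; the projection matrix `L`
below is a random placeholder, not a learnt one:

```python
import numpy as np

def project_to_virtual_camera(feats, L):
    """Map features into the virtual camera Cv via the projection L, so
    Euclidean distance in Cv equals the Mahalanobis distance under
    M = L.T @ L."""
    return feats @ L.T

rng = np.random.default_rng(1)
L = rng.normal(size=(64, 128))             # placeholder "learnt" projection
cb_feats = rng.normal(size=(20, 128))      # features from camera Cb
ca_reference = rng.normal(size=(30, 128))  # basic reference subset from Ca

cb_v = project_to_virtual_camera(cb_feats, L)
ref_v = project_to_virtual_camera(ca_reference, L)
# Neighborhoods on the Cb side are then found with Euclidean distance
# in the virtual camera space.
dists = np.linalg.norm(cb_v[:, None, :] - ref_v[None, :, :], axis=2)
neighbours = np.argsort(dists, axis=1)[:, :10]
```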