Abstract:
This thesis addresses the detection and classification of vehicles in images and videos of
local traffic in Rawalpindi/Islamabad using the YOLO-v5 architecture. YOLO-v5
outperforms traditional object detection algorithms and is computationally
faster than other YOLO versions. We employ transfer learning to
fine-tune the weights of the pre-trained YOLO-v5 network so that they are adapted to our
local traffic patterns. For this purpose, extensive data sets of images and videos of the local
traffic patterns were collected. These data sets were made comprehensive by covering varied
conditions such as high-density traffic, low-density traffic, occlusion, and different
weather conditions. All of these data sets were manually annotated. By fine-tuning the pre-trained
network weights on our data sets, we achieved improved detection and
classification results.
Object detection and recognition is one of the most challenging tasks in computer vision,
machine learning, and artificial intelligence, and it is widely employed in a variety of fields,
for example robotics, security, surveillance, and guidance for visually impaired people.
Object detection methods differ in their network architectures but share the main aim
of detecting multiple objects that appear in an image. With the rapid development of deep learning,
many algorithms are steadily strengthening the link between video analysis and image
understanding. We are optimistic that the method developed here is a useful recent addition to
this domain.