Abstract:
Self-driving cars are an active area of interdisciplinary research spanning Artificial
Intelligence (AI), the Internet of Things (IoT), embedded systems, and control engineering. One
crucial component of autonomous navigation is accurately detecting vehicles, pedestrians, and
other obstacles on the road and ascertaining their distance from the self-driving vehicle. The
primary algorithms employed for this purpose rely on either camera images or LiDAR data; a
third category of algorithms fuses these two sensor modalities. Sensor fusion networks take 2D
camera images and LiDAR point clouds as input and output 3D bounding boxes as detection
results. In this thesis, we categorize object detection networks on the basis of their input data
and experimentally evaluate the performance of three object detection methods: YOLOv3, a
Bird's Eye View (BEV) network, and PointFusion.
We compare the three object detection networks with respect to the following metrics:
accuracy, performance in occluded environments, and computational complexity. The results of
the existing methods are replicated on the KITTI benchmark dataset, a standard dataset widely
used in vehicle detection research, to highlight their differences. The Average Precision
achieved by YOLOv3, BEV, and PointFusion is 42%, 45%, and 47.8%, respectively. Qualitative
and quantitative results show that the sensor fusion network outperforms the single-input
networks.