Imaging Solution by a Fusion of Thermal and Color Images using the Convolutional Neural Network

Khalid, Bushra


dc.contributor.author	Khalid, Bushra
dc.date.accessioned	2023-08-10T05:21:44Z
dc.date.available	2023-08-10T05:21:44Z
dc.date.issued	2018
dc.identifier.other	00000172621
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/36143
dc.description	Supervisor: Dr. Muhammad Usman Akram	en_US
dc.description.abstract	In computer vision, object detection and classification are active fields of research. Their applications span a diverse range of fields, including surveillance, autonomous cars, robotic vision, search and rescue, driver assistance systems and military applications. Researchers have built many intelligent systems that aim to match the accuracy of human perception, but none has quite achieved it yet. In the last couple of decades, the Convolutional Neural Network (CNN) has emerged as one of the most active fields of research. CNNs have a number of applications, and their architectures are used to improve accuracy and efficiency in various fields. In this research, we use CNNs to fuse visible and thermal camera images and detect the persons present in those images for a reliable surveillance application. Various image fusion methods exist for multi-sensor, multi-modal, multi-focus and multi-view image fusion. Our proposed methodology uses an Encoder-Decoder architecture for fusion of visible and thermal images and the ResNet-152 architecture for classification of images. The KAIST multi-spectral dataset, consisting of 95,000 visible and thermal images, is used to train the CNNs. In our experiments, the fused architecture outperforms the individual visible and thermal architectures: the fused architecture achieves 99.2% accuracy, while the visible architecture achieves 99.01% and the thermal architecture 98.98%. Images obtained from ResNet-152 are then fed into Mask-RCNN, which uses the ResNet-101 architecture, for localization of persons. The results show that the fused model for object localization outperforms the visible model and gives promising results for person detection in surveillance. Our proposed localization module gives a miss rate of 5.25%, which is 5 percent better than the best previously proposed techniques.	en_US
dc.language.iso	en	en_US
dc.publisher	College of Electrical & Mechanical Engineering (CEME), NUST	en_US
dc.subject	.	en_US
dc.title	Imaging Solution by a Fusion of Thermal and Color Images using the Convolutional Neural Network	en_US
dc.type	Thesis	en_US