Abstract:
This thesis investigates the effective implementation of the YOLO object recognition
model on NVIDIA Jetson through the application of model quantization techniques.
Specifically, the research focuses on Quantization-Aware Training (QAT) and Asymmetric
Quantization to optimize the model's performance on resource-constrained edge devices.
NVIDIA Jetson devices, designed to handle AI tasks in edge computing scenarios, often
face limitations in memory, power, and computational capacity. The
research evaluates the baseline performance of the YOLO model on a standard NVIDIA
Jetson device and details the methodology of applying QAT and Asymmetric
Quantization, followed by a comparative analysis of their effects. The results indicate that
while quantization techniques lead to a slight decrease in accuracy, they substantially
reduce inference time. This improvement in inference speed underscores the potential for
deploying the quantized YOLO model in real-time scenarios where inference time is
prioritized over accuracy. This thesis contributes to the fields of edge computing and
real-time image processing by providing a comprehensive framework for deploying
high-performance AI models in constrained environments. The findings demonstrate that model
quantization is a viable strategy for achieving efficient and robust real-time object
recognition on resource-limited devices.
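As an illustration of the asymmetric (affine) quantization scheme named above, the following is a minimal sketch rather than code from the thesis; the 8-bit unsigned integer range and the NumPy-based helper functions are assumptions chosen purely for clarity.

```python
import numpy as np

def asymmetric_quantize(x, num_bits=8):
    """Map float values to unsigned integers with an asymmetric (affine)
    scheme: q = round(x / scale) + zero_point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Scale maps the observed float range onto the integer range;
    # the small floor avoids division by zero for constant tensors.
    scale = max((x_max - x_min) / (qmax - qmin), 1e-8)
    # Zero point shifts the range so that x_min lands on qmin.
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximation of the original float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize a random tensor and measure the reconstruction error.
x = np.random.randn(1000).astype(np.float32)
q, scale, zero_point = asymmetric_quantize(x)
x_hat = dequantize(q, scale, zero_point)
print("max abs error:", np.abs(x - x_hat).max())
```

Because the zero point lets the integer grid sit asymmetrically around zero, this scheme represents skewed activation ranges more faithfully than a symmetric one, which is the trade-off the thesis evaluates on the Jetson hardware.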