Detection and Recognition of Medicine Packaging and Information Using Deep Learning and Computer Vision

Rukhsar, Muhammad Khurram

URI: http://10.250.8.41:8080/xmlui/handle/123456789/47355

Date: 2024

Abstract:

The global pharmaceutical industry is worth 1.48 trillion USD [1]. The industry needs automation, especially in the developing world. This thesis investigates the application of object detection and optical character recognition (OCR) techniques to detect medicine boxes and extract information from them. The primary objective is to develop a deep learning-based system that can accurately detect and recognize medicine boxes and extract critical information such as the batch number, expiry date, manufacturing date and retail price. The study compares the performance of five prominent object detection models, YOLOv8n, YOLOv8s, YOLOv9t, YOLOv9c and YOLOv10n, trained, tested and validated on a custom dataset comprising 21,579 images. The methodology first trains a number of different YOLO and RTDETR models, with different numbers of epochs, on a similar but smaller exploratory dataset of medicine boxes. A total of 68 models were trained on the exploratory dataset and then evaluated using their mAP50-95 vs. inference time graphs. Five models were chosen: those that offered the best balance of precision and speed, or that were the fastest or the most precise. On the exploratory dataset, these were YOLOv8n, YOLOv8s, YOLOv9t, YOLOv9c and YOLOv10n. In the next phase, the five selected models were trained from scratch, with varying numbers of epochs, on a larger dataset, termed the primary dataset. The model that performed best of the five in terms of speed and accuracy was then integrated into a GUI system built for the industry. This system performs inference on medicine boxes to detect them and extracts crucial information from them using OCR.
Future work will focus on areas for improvement, namely expanding the dataset, refining the models and exploring additional features, with the aim of further enhancing the system's performance and applicability.
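
The selection step described in the abstract, choosing models from an mAP50-95 vs. inference time comparison, can be illustrated with a small sketch. This is not the thesis's actual procedure: it assumes one plausible reading, keeping only the models that no other model beats on both precision and speed (a Pareto front). The `Candidate` type, the `pareto_front` helper, the model names and all numbers are hypothetical placeholders, not results from the thesis.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    map50_95: float      # detection precision, higher is better
    inference_ms: float  # inference time per image, lower is better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    A candidate is dominated when another one is at least as precise
    and at least as fast, and strictly better on one of the two.
    """
    front = []
    for c in candidates:
        dominated = any(
            o.map50_95 >= c.map50_95
            and o.inference_ms <= c.inference_ms
            and (o.map50_95 > c.map50_95 or o.inference_ms < c.inference_ms)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    # Order from fastest to slowest for easier reading.
    return sorted(front, key=lambda c: c.inference_ms)


# Placeholder values for illustration only; not results from the thesis.
candidates = [
    Candidate("model_a", 0.60, 2.0),
    Candidate("model_b", 0.70, 5.0),
    Candidate("model_c", 0.65, 6.0),  # slower and less precise than model_b
    Candidate("model_d", 0.80, 12.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Under this assumption the front always contains the fastest and the most precise model, along with any model that trades one property for the other without being beaten on both, which matches the three kinds of model the abstract says were chosen.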