NUST Institutional Repository

Human Activity Recognition using Skeleton Information using 2D Data

dc.contributor.author Asif Mattoo, Fahad ul Hassan
dc.date.accessioned 2023-07-11T04:18:20Z
dc.date.available 2023-07-11T04:18:20Z
dc.date.issued 2023
dc.identifier.other 319806
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/34537
dc.description Supervisor: Dr. Umar Shahbaz Khan en_US
dc.description.abstract Video data is ubiquitous and rich in information, but how can we automatically recognize the actions and activities of the humans and objects in it? This is the challenge of video-based activity recognition, which has become increasingly important in many real-world scenarios. However, most existing methods are tailored to specific use cases and lack a generic approach that can handle any action class regardless of the source or quality of the video data. In this thesis, we propose a novel end-to-end framework and a deep learning model, named PCM, for multiple-person action recognition. PCM is built from 1D convolutional layers and has multiple streams that capture motion and spatial information from pose data. The framework consists of several elements, including a pre-processing pipeline, a data augmentor, a pose estimation module, and an object tracking module. Our framework and model can be trained on action classes from different datasets and can use either 2D or 3D pose data as input. We evaluated the pose classification model and the framework separately on several datasets. The pose classification model achieved accuracies of 81.3%, 95.7%, and 94.9% on the joint-annotated human motion database ground truth (JHMDB-GT), skeleton-based hand gesture recognition (SHREC), and first-person hand action benchmark (FPHAB) datasets, respectively. The full framework achieved an accuracy of 72.8% on JHMDB. To test the robustness and genericity of our framework, we applied it to a variety of datasets covering different action classes and video data sources and obtained satisfactory accuracies on each. To assess its suitability for real-world use, we performed a speed analysis by running the system on a video stream and measuring the number of frames processed per second, obtaining 485 frames per second for the pose classification model and 22 frames per second for the complete framework. These results indicate that our end-to-end framework for multiple-person action recognition has strong potential for real-time use in diverse applications such as security, healthcare, and behavior analysis. Moreover, the framework can be extended to interaction-based action recognition and complete video analytics. (An illustrative sketch of such a multi-stream pose classifier follows this record.) en_US
dc.language.iso en en_US
dc.publisher College of EME, NUST en_US
dc.subject Activity Recognition, Pose Classification, Skeleton Information, Multiple-Person, Feature Fusion en_US
dc.title Human Activity Recognition using Skeleton Information using 2D Data en_US
dc.type Thesis en_US
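
The abstract above describes PCM as a deep model built from 1D convolutional layers, with multiple streams that capture spatial and motion information from pose data before fusing their features. The PyTorch sketch below only illustrates that general idea under assumed settings: a hypothetical TwoStreamPoseClassifier with a spatial stream over raw joint coordinates and a motion stream over frame-to-frame differences, fused by concatenation. The layer sizes, joint count, class count, and fusion scheme are assumptions made for illustration, not the thesis's actual PCM design.

# Minimal, illustrative sketch of a two-stream 1D-convolutional pose
# classifier. All hyperparameters below are assumptions for illustration
# and do not reproduce the PCM architecture described in the thesis.
import torch
import torch.nn as nn


class TwoStreamPoseClassifier(nn.Module):
    def __init__(self, num_joints=15, coords=2, num_classes=21):
        super().__init__()
        in_channels = num_joints * coords  # flattened (x, y) per frame

        def stream():
            # 1D convolutions run over the temporal axis of the pose sequence.
            return nn.Sequential(
                nn.Conv1d(in_channels, 64, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv1d(64, 128, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # pool over time
            )

        self.spatial_stream = stream()  # raw joint coordinates
        self.motion_stream = stream()   # frame-to-frame joint displacements
        self.classifier = nn.Linear(128 * 2, num_classes)

    def forward(self, poses):
        # poses: (batch, frames, joints, coords), 2D or 3D coordinates
        b, t, j, c = poses.shape
        x = poses.reshape(b, t, j * c).transpose(1, 2)      # (b, channels, t)
        motion = x[:, :, 1:] - x[:, :, :-1]                 # temporal differences
        feats = torch.cat(
            [self.spatial_stream(x).flatten(1),
             self.motion_stream(motion).flatten(1)], dim=1)  # feature fusion
        return self.classifier(feats)


# Example: a batch of 4 clips, 32 frames, 15 joints with (x, y) coordinates.
logits = TwoStreamPoseClassifier()(torch.randn(4, 32, 15, 2))
print(logits.shape)  # torch.Size([4, 21])

Because the input channel count is derived from joints × coordinates, the same sketch accepts either 2D or 3D pose data, mirroring the abstract's statement that both can serve as input.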


This item appears in the following Collection(s)

  • MS [205]
