Abstract:
Speech and hearing impairment is a condition that limits a person's capacity to communicate
verbally and audibly. Individuals affected by it adopt sign language and other alternative
forms of communication. Even though sign language has become more widely used in recent
years, it is still difficult for non-signers to engage with the individuals who use it.
There has been promising progress in motion and gesture recognition through the combination
of computer vision and deep learning techniques. This study puts forward an approach that
uses deep learning to automate the recognition of American Sign Language, thereby lowering
barriers to effective communication between hard-of-hearing individuals and hearing
communities. Several deep learning techniques have previously been employed for sign
language gesture recognition, with video sequences used as input for extracting spatial and
temporal information. Advancements in word-level sign language recognition (WSLR) can
drastically reduce the need for human translators and enable signers and non-signers to
communicate easily. The majority of methods currently in use rely on additional equipment
such as sensor devices, gloves, or depth cameras, which constrains their ease of use in
real-life situations. Such situations may benefit from
deep learning techniques that are entirely vision-based and non-intrusive. American Sign
Language has its own rules for syntax and grammar, much like any other spoken language. Like
every other language, ASL is a living language that evolves and develops over time. The
majority of ASL users are found in the United States and Canada, and most schools and
institutions across the US accept ASL toward fulfillment of current and "international"
degree requirements. This study uses deep learning methods to recognize American Sign
Language words, using the WLASL (Word-Level American Sign Language) dataset, from which a
subset of 50 classes was chosen. VGG16-LSTM and ConvLSTM based architectures were used,
chosen for their ability to model spatio-temporal features. We observed that VGG16-LSTM
outperformed the ConvLSTM architecture. Both models' performances are examined using
accuracy as the evaluation metric and judged according to how well they perform on test
videos.
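
As an illustration of the approach, the minimal sketch below shows how a VGG16-LSTM
classifier for a 50-class WLASL subset could be assembled with TensorFlow/Keras: a frozen
VGG16 backbone extracts per-frame spatial features and an LSTM aggregates them over time
before a 50-way softmax. The frame count, input resolution, LSTM width, and optimizer are
illustrative assumptions, not the exact configuration used in this study.

    # Minimal VGG16-LSTM sketch for word-level sign recognition (illustrative only).
    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras.applications import VGG16

    NUM_FRAMES = 20                # assumed number of frames sampled per video
    FRAME_SIZE = (224, 224, 3)     # VGG16's standard input resolution
    NUM_CLASSES = 50               # 50-class WLASL subset, as in this study

    # Frozen, ImageNet-pretrained VGG16 serves as the per-frame spatial feature extractor.
    backbone = VGG16(weights="imagenet", include_top=False, pooling="avg",
                     input_shape=FRAME_SIZE)
    backbone.trainable = False

    inputs = tf.keras.Input(shape=(NUM_FRAMES,) + FRAME_SIZE)
    x = layers.TimeDistributed(backbone)(inputs)   # apply VGG16 to every frame
    x = layers.LSTM(256)(x)                        # aggregate frame features over time
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # integer class labels
                  metrics=["accuracy"])                     # accuracy, as used for evaluation
    model.summary()

A ConvLSTM variant could instead operate directly on the frame sequence using
tf.keras.layers.ConvLSTM2D layers in place of the TimeDistributed VGG16 and LSTM stages.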