Comparison of deep learning techniques for Sindhi language speech recognition

Nawaz, Muhammad

dc.contributor.author	Nawaz, Muhammad
dc.date.accessioned	2023-08-19T10:41:36Z
dc.date.available	2023-08-19T10:41:36Z
dc.date.issued	2021
dc.identifier.other	20344
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/36940
dc.description	Supervisor: Muhammad Nawaz	en_US
dc.description.abstract	With great advances in computational power, Automatic Speech Recognition (ASR) systems have seen a surge in interest and usage. Much ASR research has been done for languages such as Chinese, English, Spanish, Korean, and even our national language, Urdu, resulting in better Human Computer Interaction (HCI). But there is a dearth of speech recognition systems for regional and local languages like Sindhi. Over 30 million speakers of Sindhi in Pakistan are unable to communicate with a machine in Sindhi, which is a great hurdle to utilizing the best of what technology has to offer. ASR systems built specifically for local languages can help overcome these hurdles. In this study, a speech recognition system for the Sindhi language has been built with the Kaldi toolkit. Hidden Markov Models (HMM) have been used along with Gaussian Mixture Models (GMM) and Deep Neural Networks (DNN). Experiments have been conducted on GMM-HMM and DNN-HMM techniques regarding noise, training size, phonetic dictionary size and DNN parameters. DNNs were tested and compared using parameters such as the value of p in the p-norm non-linearity, the number of hidden layers and the learning rate. A DNN with 6 hidden layers and p=2 gave the best results. The accuracy of our speech recognition system is measured by Word Error Rate (WER). Experiments have been carried out on various speech recognition models and recipes to improve WER. These results could then be utilized in areas such as navigation and home automation to increase HCI and the usage of technology by Sindhi speakers.	en_US
dc.language.iso	en	en_US
dc.publisher	School of Electrical Engineering and Computer Science NUST SEECS	en_US
dc.subject	Sindhi Speech Recognition, Kaldi, HCI, WER	en_US
dc.title	Comparison of deep learning techniques for Sindhi language speech recognition	en_US
dc.type	Thesis	en_US
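
The abstract reports accuracy as Word Error Rate (WER). As a rough illustration of that metric only, and not the thesis's own scoring code (which is not shown in this record), WER is commonly computed as the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization below are assumptions made for this sketch.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.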