NUST Institutional Repository

Comparison of deep learning techniques for Sindhi language speech recognition

dc.contributor.author Nawaz, Muhammad
dc.date.accessioned 2023-08-19T10:41:36Z
dc.date.available 2023-08-19T10:41:36Z
dc.date.issued 2021
dc.identifier.other 20344
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/36940
dc.description Supervisor: Muhammad Nawaz en_US
dc.description.abstract With the great advances in computational power, Automatic Speech Recognition (ASR) systems have seen a surge in interest and usage. Much research has been done on ASR systems for languages such as Chinese, English, Spanish, Korean, and even our national language, Urdu, resulting in better Human Computer Interaction (HCI). But there is a dearth of speech recognition systems for regional and local languages like Sindhi. Over 30 million speakers of Sindhi in Pakistan are unable to communicate with a machine in Sindhi, which is a great hurdle to utilizing the best of what technology has to offer. ASR systems built specifically for local languages can help overcome these hurdles. In this study, a speech recognition system for the Sindhi language has been built with the Kaldi toolkit. Hidden Markov Models (HMM) have been used along with Gaussian Mixture Models (GMM) and Deep Neural Networks (DNN). Experiments have been conducted on GMM-HMM and DNN-HMM techniques with respect to noise, training size, phonetic dictionary size, and DNN parameters. DNNs were tested and compared using parameters such as the value of p in the p-norm non-linearity, the number of hidden layers, and learning rates. The DNN with 6 hidden layers and p=2 gave the best results. The accuracy of our speech recognition system is measured in Word Error Rate (WER). Experiments have been carried out on various speech recognition models and recipes to improve WER. These results could then be utilized in areas such as navigation and home automation to increase HCI and the use of technology by Sindhi speakers. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science NUST SEECS en_US
dc.subject Sindhi Speech Recognition, Kaldi, HCI, WER en_US
dc.title Comparison of deep learning techniques for Sindhi language speech recognition en_US
dc.type Thesis en_US
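
The abstract reports accuracy as Word Error Rate (WER). For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. It is not the thesis's own scoring code or Kaldi's scoring tool; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokenizes on whitespace, which is an assumption
    and may not match how a real ASR scoring pipeline normalizes text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions are counted against a fixed reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.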


This item appears in the following Collection(s)

  • MS [375]