NUST Institutional Repository

ASRB: A Novel Automatic Speech Recognition for Spoken Burushaski Language

dc.contributor.author Wali, Hussain
dc.date.accessioned 2023-06-23T09:20:24Z
dc.date.available 2023-06-23T09:20:24Z
dc.date.issued 2023
dc.identifier.other 321147
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/34190
dc.description Supervisor: Dr Muhammad Khuram Shehzad en_US
dc.description.abstract With this thesis, we aim to establish a foundation for research and development on the Burushaski language. We construct the first-ever audio and textual dataset for the language that can be used in future research. Our goals include the development of a Latin-based script, a structured and clean audio dataset, a usable text corpus, and an initial Automatic Speech Recognition (ASR) system for the Burushaski language, built with the Kaldi toolkit on the developed datasets. Burushaski is a language isolate and is considered one of the most difficult languages to learn and model. In this thesis, we present the first-ever free, open-source database of audio and text datasets of the Burushaski language, collected from speakers. Additionally, we present a continuous Burushaski speech recognition model built with the Kaldi toolkit. From continuous speech samples in the Burushaski audio dataset, we extracted Mel-frequency cepstral coefficient (MFCC) features for the ASR system. We report in detail on the performance of the ASR system for both monophone and triphone models, including the tri1, tri2, and tri3 models, using an N-gram language model. Word error rate (WER) is the metric we used to measure the performance of the system. We trained the system on a limited dataset and observed that the triphone model (tri3) performs significantly better than the monophone model. The tri3 model also performed much better than the tri2 model, and the tri2 model performed better than the tri1 model. We also present a detailed framework that can be used to design and develop ASR systems for other zero-resource languages. This framework can be used for dataset generation for any language. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Sciences (SEECS), NUST en_US
dc.title ASRB: A Novel Automatic Speech Recognition for Spoken Burushaski Language en_US
dc.type Thesis en_US
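
The abstract names word error rate (WER) as the evaluation metric. As a rough illustration only, the sketch below computes WER by its standard definition: the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical placeholders; they are not taken from the thesis or its dataset, and this is not Kaldi's own scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: both transcripts are whitespace-tokenized and already
    normalized (casing, punctuation) in the same way.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder sentences, not data from the Burushaski corpus.
    print(word_error_rate("the cat sat down", "the dog sat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference; lower values indicate better recognition.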