Abstract:
Speech recognition is one of the most significant research topics of recent times. Human-computer interaction studies ways to make the interaction between humans and computers more efficient; from the keyboard to the touch screen, and from the CLI to the GUI, this interaction keeps improving. Speech also plays an important role in making it more efficient, not only for persons with disabilities but for all users. To interact with a computer through speech, the computer must be able to understand spoken words; for this purpose, spoken data is converted into written text, which is then used for further processing. In this thesis, the primary focus is the Urdu language. Urdu speech data was collected from multiple sources, cleaned, and then used for training with CMUSphinx, an HMM-based toolkit. More than 83 hours of data from 181 speakers were used for training, and more than 8 hours of data from 20 speakers for testing. The minimum achieved WER is 35.6% against 5 test speakers and 44% against all 20 test speakers, which is the best result among published papers on Urdu-based ASRs. Training produces an acoustic model, which is used in two Android-based applications. The first is an automatic technical support system, in which the user calls to get technical support from his or her internet provider; the system understands the question and replies with an appropriate answer, and if it is unable to understand the question, it asks the user to speak again. The second is a command-based application, in which the user gives a spoken command and the system understands it and acts accordingly. Another module of this application is speech-and-type, in which the user's speech is converted into written text that can then be copied and shared. Finally, an Urdu-based desktop helper application was developed to remove errors and reduce the error rate by letting the researcher compare the test input text with the recognized text.
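For reference, WER above denotes the standard word error rate; stated minimally (the symbols S, D, I, and N are the conventional ones, not taken from this thesis):

WER = (S + D + I) / N

where S, D, and I are the numbers of substituted, deleted, and inserted words in the recognized text relative to the reference transcript, and N is the total number of words in the reference.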