SPEAKERS IDENTIFICATION SYSTEM FOR CORE NETWORKS USING HADOOP CLUSTERS

DR SHOAB A KHAN, MAVERA, MARIA

URI: http://10.250.8.41:8080/xmlui/handle/123456789/52503

Date: 2012

Abstract:

This project demonstrates distributed computing through a speaker identification application. It takes a database approach: the voice of an unknown speaker is compared against stored voice data to identify a specific person. All processing runs on a Hadoop cluster of about five computers. First, a database of voice features is created and maintained in HBase, the database built on Hadoop. The voice samples were recorded from students and teachers at the college. For training, the voice inputs are pre-processed to remove noise, and MFCC (mel-frequency cepstral coefficient) features are extracted from the signals using MATLAB VoiceBox. Vector quantization, a compression technique, then reduces the number of MFCC feature vectors: K-Means clustering is applied and the resulting data is saved to HBase as the training data. To identify a speaker, an unknown voice signal is taken as input and pre-processed, its MFCC features are extracted, and these are compared with every database entry to find the most likely match. Matching is done in parallel: the inputs are distributed over the cluster, and MapReduce tasks run on every node running a TaskTracker. All nodes are controlled by a master node running the JobTracker and NameNode. Major applications of the project include site access, credit card authorization, secure phone access to banking, database services, and access to secure equipment.
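The training and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which used MATLAB VoiceBox, HBase, and Hadoop MapReduce; this is plain Python under stated assumptions. The function names (`train_codebook`, `map_phase`, `reduce_phase`, `identify`) are hypothetical, an in-memory dictionary stands in for the HBase table, and average squared Euclidean distortion is assumed as the matching score because the abstract does not name the metric. The feature vectors are assumed to be MFCC vectors that have already been extracted.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(features, k, iterations=20, seed=0):
    """Vector quantization via K-Means: reduce many feature vectors
    to k centroids (the speaker's codebook)."""
    rng = random.Random(seed)
    centroids = rng.sample(features, k)  # initialise from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in features:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def distortion(features, codebook):
    """Average distance from each query vector to its nearest centroid
    (assumed matching score; lower means a closer match)."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in features) / len(features)


def map_phase(record, query_features):
    """Map step: score one stored speaker against the query.
    In a cluster, each node would run this on its share of the records."""
    speaker_id, codebook = record
    return speaker_id, distortion(query_features, codebook)


def reduce_phase(scored):
    """Reduce step: keep the speaker with the lowest distortion."""
    return min(scored, key=lambda pair: pair[1])


def identify(database, query_features):
    """database: dict of speaker_id -> codebook (stand-in for HBase)."""
    scored = [map_phase(record, query_features) for record in database.items()]
    return reduce_phase(scored)
```

Splitting the comparison into a per-record map step and a single minimum-taking reduce step mirrors why the matching parallelizes well: each stored codebook can be scored independently of the others, so the records can be divided among nodes.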