PARALLEL ASSOCIATION RULES MINING FOR URDU LANGUAGE

ASAD, NAZISH

URI: http://10.250.8.41:8080/xmlui/handle/123456789/37283

Date: 2011

Abstract:

Extensive research has been conducted in the field of Association Rules Mining (ARM) for natural languages. Academics and researchers have conceived many applications of ARM for several domains of the Urdu language (e.g. education, publications and web development). Many of these applications require accuracy, and achieving accuracy is computation- and communication-intensive. A severe lack of resources and limited capability have made providing accuracy a challenging task in ARM. Techniques therefore need to be devised that provide accuracy without exhausting the limited resources available to ARM. This thesis focuses on the design, implementation and analysis of an Urdu Mining Model (UMM) based upon Hybrid Apriori (HA), Enhanced Multipass with Inverted Hashing and Pruning (EMIHP) and Enhanced Parallel Multipass with Inverted Hashing and Pruning (EPMIHP) that fulfills the requirements of accuracy and efficiency without compromising resources. The algorithms are optimized by using hash tables and by avoiding the mining of redundant rules. The proposed UMM, HA, EMIHP and EPMIHP have been implemented using C# and Microsoft SQL Server 2005 in the .NET framework. To evaluate the system, several experiments have been carried out on different numbers of words using different thresholds. UMM, HA, EMIHP and EPMIHP have been evaluated through a comprehensive efficiency analysis. Furthermore, the execution times of HA, EMIHP and EPMIHP have been compared with those of other algorithms. The reduction in execution time and the increase in accurate association rules show that HA, EMIHP and EPMIHP are viable choices for mining association rules of the Urdu language. Since HA, EMIHP and EPMIHP provide efficiency and accuracy and operate within the available resources, they have the potential to be widely accepted.
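For readers unfamiliar with the underlying technique, the following is a minimal sketch of classical Apriori-style association rule mining over sets of words, with candidate counting done in a hash table. It is not the thesis's HA, EMIHP or EPMIHP, whose details the abstract does not give, and it is written in Python rather than the C# used in the thesis. The function names, the transliterated sample words and the threshold values are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {frozenset: support count}."""
    transactions = [frozenset(t) for t in transactions]
    # Pass 1: count single words in a hash table (Counter is a dict subclass).
    counts = Counter(frozenset([w]) for t in transactions for w in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        candidates = set()
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        for a, b in combinations(list(current), 2):
            union = a | b
            # Prune step: every (k-1)-subset must itself be frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count pass: one hash-table increment per candidate contained in a transaction.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for each qualifying rule."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                conf = support / frequent[lhs]
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Hypothetical sample: each set stands for the words of one sentence (transliterated).
sample = [
    {"kitab", "qalam", "ustad"},
    {"kitab", "qalam"},
    {"kitab", "ustad"},
    {"qalam", "kaghaz"},
]
found = frequent_itemsets(sample, min_support=2)
strong = rules(found, min_confidence=0.8)
```

The minimum support and minimum confidence arguments play the role of the "thresholds" mentioned in the abstract. Raising either one shrinks the set of rules and the amount of counting work, which is the trade-off between accuracy and resource use that the thesis addresses.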