NUST Institutional Repository

DOCUMENT TOPIC GENERATION IN TEXT MINING BY USING CLUSTER ANALYSIS WITH ENHANCED ROCK (EROCK)

dc.contributor.author AHMAD, RIZWAN
dc.date.accessioned 2023-08-25T10:13:47Z
dc.date.available 2023-08-25T10:13:47Z
dc.date.issued 2010
dc.identifier.other [2006-NUST-MS PhD-CSE (E)-04]
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/37541
dc.description Supervisor: DR. AASIA KHANUM en_US
dc.description.abstract Clustering is a useful technique in textual data mining. Cluster analysis divides objects into meaningful groups, called clusters, based on information about the objects and the relationships between them. A large amount of material on almost any topic is available on the internet with a single click, and it becomes tedious for the user to separate the information that is actually required from the rest of the data, particularly when this has to be done manually. This project explains how to address this problem and assist the user effectively. We used the ROCK algorithm with some modifications. ROCK generates better clusters than other clustering algorithms for data with categorical attributes. We used the cosine measure to compute the similarity between two documents, and an adjacency list instead of a sparse matrix to store the documents. Because of these enhancements, the algorithm is named Enhanced ROCK, or EROCK. These changes affect the time and space complexity of the algorithm. The algorithm was evaluated on text documents, and experimental results on standard test documents show the outcomes of EROCK. The similarity threshold, the number of clusters to be obtained, and the text documents (corpus) are the main parameters used for the EROCK evaluation. EROCK was implemented in Java with jdk1.6.0, using NetBeans IDE 6.5.1 as the development environment. Experiments were carried out on a variety of standard text documents following a specific approach. en_US
dc.language.iso en en_US
dc.publisher College of Electrical & Mechanical Engineering (CEME), NUST en_US
dc.title DOCUMENT TOPIC GENERATION IN TEXT MINING BY USING CLUSTER ANALYSIS WITH ENHANCED ROCK (EROCK) en_US
dc.type Thesis en_US
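
The abstract names three ingredients: a cosine measure between documents, a similarity threshold, and an adjacency list in place of a sparse matrix. The Java sketch below shows one way these could fit together and is not the thesis implementation. The class and method names (`ErockSketch`, `termFrequencies`, `cosine`, `neighbours`, `links`) are hypothetical, raw term counts are assumed as document vectors, and it is assumed that the adjacency list holds each document's neighbours, that is, the documents whose similarity reaches the threshold. The link count follows the standard ROCK idea of counting common neighbours, simplified here by not treating a document as its own neighbour. The code targets a recent JDK rather than jdk1.6.0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names are hypothetical, not taken from the thesis.
class ErockSketch {

    // Build a sparse term-frequency vector from raw text (assumed representation).
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            tf.merge(token, 1, Integer::sum);
        }
        return tf;
    }

    // Cosine similarity between two sparse term-frequency vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int va = e.getValue();
            normA += va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) dot += va * vb;
        }
        for (int vb : b.values()) normB += vb * vb;
        if (normA == 0.0 || normB == 0.0) return 0.0; // empty document
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Adjacency list: entry i lists every document j != i with cosine(i, j) >= theta.
    static List<List<Integer>> neighbours(List<Map<String, Integer>> docs, double theta) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) adj.add(new ArrayList<>());
        for (int i = 0; i < docs.size(); i++) {
            for (int j = i + 1; j < docs.size(); j++) {
                if (cosine(docs.get(i), docs.get(j)) >= theta) {
                    adj.get(i).add(j);
                    adj.get(j).add(i);
                }
            }
        }
        return adj;
    }

    // ROCK-style link count: number of neighbours shared by documents i and j.
    static int links(List<List<Integer>> adj, int i, int j) {
        int count = 0;
        for (int n : adj.get(i)) {
            if (adj.get(j).contains(n)) count++;
        }
        return count;
    }
}
```

Storing only the pairs that pass the threshold means memory grows with the number of neighbour pairs rather than with the square of the corpus size, which is consistent with the abstract's remark that the changes affect time and space complexity.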


This item appears in the following Collection(s)

  • MS [441]
