NUST Institutional Repository

Text Mining through Modified Label Induction Grouping Algorithm

dc.contributor.author Gulshan Saleem
dc.date.accessioned 2020-12-31T10:53:18Z
dc.date.available 2020-12-31T10:53:18Z
dc.date.issued 2016
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/20266
dc.description Supervisor: Usman Qamar en_US
dc.description.abstract The Label Induction Grouping Algorithm (LINGO) groups text documents on the basis of similar content. Its main application is clustering the documents returned by web search engines, and in that role it also works like a search engine. The algorithm operates in two phases. The first phase, label induction, derives the cluster labels from the text documents using latent semantic indexing, a well-known information retrieval technique. The second phase, content discovery, assigns documents to those labels using another information retrieval method, the vector space model. This study modifies the existing algorithm by changing how content is assigned to the induced labels: latent semantic indexing is used for content assignment as well as for label induction, which improves recall. Cluster quality also improves significantly, and overlap between clusters is reduced by introducing a merge operation before the final clusters are formed. The proposed algorithm is evaluated on the 20 Newsgroups dataset, and all of the work is carried out in MATLAB R2016b. en_US
dc.publisher EME, National University of Sciences and Technology, Islamabad en_US
dc.subject Computer Engineering en_US
dc.title Text Mining through Modified Label Induction Grouping Algorithm en_US
dc.type Thesis en_US
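
The modification described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation (which was written in MATLAB) and makes several assumptions that are not in the record: documents are raw term-count vectors, labels are given as term vectors rather than induced, the latent space is approximated by power iteration instead of a full SVD, and clusters are merged when their Jaccard overlap reaches a threshold. All function names, the toy data, and both thresholds are hypothetical.

```python
import math
import random


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(x, y):
    nx, ny = math.sqrt(dot(x, x)), math.sqrt(dot(y, y))
    return dot(x, y) / (nx * ny) if nx > 0 and ny > 0 else 0.0


def latent_basis(docs, k, iters=300, seed=0):
    """Approximate the top-k left singular vectors of the term-document
    matrix by power iteration on A A^T (illustrative, not robust)."""
    n_terms = len(docs[0])
    gram = [[sum(d[i] * d[j] for d in docs) for j in range(n_terms)]
            for i in range(n_terms)]
    rng = random.Random(seed)
    basis = []
    for _ in range(k):
        v = normalize([rng.random() + 0.1 for _ in range(n_terms)])
        for _ in range(iters):
            w = [dot(row, v) for row in gram]
            # Remove components along the vectors already found.
            for u in basis:
                proj = dot(w, u)
                w = [wi - proj * ui for wi, ui in zip(w, u)]
            v = normalize(w)
        basis.append(v)
    return basis


def assign(docs, label, threshold, basis=None):
    """Indices of documents whose cosine similarity to the label meets the
    threshold. With a basis, both sides are first projected into the latent
    space (LSI-style); without one, raw term vectors are compared (VSM-style)."""
    if basis is not None:
        docs = [[dot(d, u) for u in basis] for d in docs]
        label = [dot(label, u) for u in basis]
    return [i for i, d in enumerate(docs) if cosine(d, label) >= threshold]


def merge_overlapping(clusters, min_jaccard=0.5):
    """Greedy single pass; the Jaccard criterion is an assumption."""
    merged = []
    for members in map(set, clusters):
        for existing in merged:
            union = members | existing
            if union and len(members & existing) / len(union) >= min_jaccard:
                existing |= members
                break
        else:
            merged.append(members)
    return merged


# Toy corpus: rows are documents, columns are terms (two disjoint topics).
docs = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Hypothetical labels, each expressed as a single-term vector.
labels = {
    "label_a": [1, 0, 0, 0, 0],
    "label_b": [0, 1, 0, 0, 0],
    "label_c": [0, 0, 0, 1, 0],
}

basis = latent_basis(docs, k=2)
vsm_clusters = {name: assign(docs, vec, 0.5) for name, vec in labels.items()}
lsi_clusters = {name: assign(docs, vec, 0.5, basis) for name, vec in labels.items()}
merged = merge_overlapping(lsi_clusters.values())
```

On this toy corpus the raw term-space comparison misses document 2 for `label_a` because the document does not contain the label's term, while the latent-space comparison picks it up through its shared topic. The merge step then collapses the two latent-space clusters that cover the same documents. A real pipeline would add term weighting and label induction, which this sketch leaves out.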


This item appears in the following Collection(s)

  • MS [331]
