NUST Institutional Repository

Summarization of Opinions from Multiple Documents

dc.contributor.author Rubab Hafeez
dc.date.accessioned 2021-01-27T05:33:04Z
dc.date.available 2021-01-27T05:33:04Z
dc.date.issued 2018
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/21888
dc.description Supervisor: Dr. Sharifullah Khan en_US
dc.description.abstract The daily volume of news reporting on real-world events is growing exponentially. At the same time, people need the most important information about an event or topic in an organized, compact form in order to make decisions. Document summarization addresses the problem of presenting information to readers in a compact form. Different approaches to summarizing documents have been proposed and evaluated in the literature. Common research problems in summarization are redundancy and the extraction of sentences that are important and semantically linked with other sentences. We propose a novel multi-document summarization approach that combines agglomerative hierarchical clustering with Latent Semantic Analysis (LSA), which measures the semantic similarity among different terms and reduces dimensionality by preserving only highly weighted vectors. To identify important terms for the summary, we use the Latent Dirichlet Allocation (LDA) model. LDA is a generative statistical model that explains a set of observations through a small number of topics, where the presence of each word is attributable to the topics of the documents. We evaluate our system against other state-of-the-art techniques using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric on the Document Understanding Conference (DUC) 2004 dataset. Experimental results show a substantial performance improvement with our system, which also produces a more coherent summary than the other state-of-the-art techniques. Our summarization approach improves upon current state-of-the-art summarization systems on mainstream evaluation datasets. en_US
dc.publisher SEECS, National University of Sciences and Technology, Islamabad en_US
dc.subject Information Technology en_US
dc.title Summarization of Opinions from Multiple Documents en_US
dc.type Thesis en_US


This item appears in the following Collection(s)

  • MS [435]