NUST Institutional Repository

RANKED INFORMATION RETRIEVAL USING WEIGHTED TF IDF

dc.contributor.author ANWAR, SALEEM
dc.date.accessioned 2023-08-29T05:50:36Z
dc.date.available 2023-08-29T05:50:36Z
dc.date.issued 2008
dc.identifier.other (2005-NUST-MS PhD-CSE(E)-07)
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/37775
dc.description Supervisor: DR SHOAB AHMED KHAN en_US
dc.description.abstract Document retrieval is the task of retrieving a relevant document in response to a query, a question, or a reference document. Tasks such as question answering, summarization, novelty detection, and information provenance use a document-retrieval module as a preprocessing step, so the performance of these systems depends on the quality of that module. Other tasks, such as information extraction and machine translation, operate on documents, using them as training data, as the unit of input or output, or both, and may benefit from document retrieval to build a training corpus or as a post-processing step. In this thesis we begin by studying the IR model, then build a thorough understanding of existing IR algorithms, including TF IDF, Okapi BM25, and pivoted length normalization. While studying these algorithms we identified some deficiencies in them and worked to remove those deficiencies. We propose a better approach for scoring documents, named Weighted TF IDF (WTF IDF), which weights terms with respect to locality in documents and term order, unlike TF IDF, where terms are only counted. Moreover, to cope with different writing styles, we search for a synonym query along with the original query, which increases the chances of retrieving novel information from the corpus. We provide implementations of the existing algorithms, compare their performance with the proposed WTF IDF approach, and present the results. The proposed approach gives better results than the existing ones. en_US
dc.language.iso en en_US
dc.publisher College of Electrical & Mechanical Engineering (CEME), NUST en_US
dc.title RANKED INFORMATION RETRIEVAL USING WEIGHTED TF IDF en_US
dc.type Thesis en_US
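
For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of plain TF IDF ranking, where terms are only counted, and not the thesis's WTF IDF method, whose locality and term-order weights are not specified in this record. The function names `tokenize` and `rank_tf_idf`, the whitespace tokenizer, and the unsmoothed logarithmic IDF are all assumptions made for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def rank_tf_idf(query, documents):
    """Return (doc_index, score) pairs sorted by descending TF IDF score."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)  # raw term counts, with no positional weighting
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Assumption: unsmoothed IDF, log(N / df).
            score += tf[term] * math.log(n_docs / df[term])
        scores.append((index, score))

    # Highest-scoring documents first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A weighted variant in the spirit of the abstract would presumably replace the raw count `tf[term]` with a weight that depends on where and in what order the terms occur, but the exact scheme would have to be taken from the thesis itself.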


This item appears in the following Collection(s)

  • MS [443]
