NUST Institutional Repository

Relevant Information Extraction from Twitter during Time-Critical Situations

dc.contributor.author Zoha Sheikh
dc.date.accessioned 2021-01-21T09:19:28Z
dc.date.available 2021-01-21T09:19:28Z
dc.date.issued 2018
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/21546
dc.description Supervisor: Dr. Sharifullah Khan en_US
dc.description.abstract People affected by disasters and emergencies look for quick responses to their queries. People also post irrelevant information, and this activity rises sharply during high-impact events. Government and relief organizations look for situational awareness information to launch relief operations, but the increase in irrelevant information prevents them from taking the necessary measures in time. Existing studies have explored text or user relevancy on Twitter using GloVe, pseudo-relevance feedback, and rule-based approaches. The GloVe approach has shown very low performance. Existing approaches have used unstructured and redundant tweets, and no measure has been taken to remove the redundant tweets. Existing systems have focused on text relevancy or user relevancy independently, but none of them provides both. The main objective of this research is to increase content relevancy and to find the sources most relevant to a topic. A novel approach is proposed to provide access to relevant information on Twitter. It has two major parts: the first identifies relevant tweets and the second identifies relevant sources. In the first part, an automated technique is proposed that makes the system dynamic and independent enough to prepare its own ground truth. To achieve this automation, third-party, domain-specific Gensim embeddings are used to expand the initial queries, and relevant messages are shown to the user based on a relevance feedback mechanism. This continuous relevance feedback helps generate the ground truth automatically. In the second part, the text relevancy score from the first part, user-specific characteristics of the sources, and tweet-specific characteristics are used to evaluate source relevancy by classifying the sources into different ranks.
Initial experiments and user studies have been performed using a real-world disaster dataset, and they show the significance of the proposed approach. The system is evaluated using measures such as mean average precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). The mean average precision of the proposed system is 89%, while the NDCG score is 95%. en_US
dc.publisher SEECS, National University of Sciences and Technology, Islamabad en_US
dc.subject Information Technology en_US
dc.title Relevant Information Extraction from Twitter during Time-Critical Situations en_US
dc.type Thesis en_US
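
The abstract reports mean average precision and NDCG. For readers unfamiliar with these measures, the following is a minimal sketch of their textbook definitions. It is not the thesis's evaluation code. The function names and input conventions are assumptions made for illustration: relevance labels are given in ranked order, average precision is normalized by the number of relevant items present in the ranked list, and NDCG uses the linear-gain form of DCG.

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary relevance flags (1 = relevant).

    Assumption: every relevant item appears somewhere in the list, so the
    sum is normalized by the number of relevant items found in it.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of average precision over several ranked lists (one per query)."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

Both measures range from 0 to 1, so a ranking that places every relevant item first scores 1.0 on each. The percentages in the abstract correspond to this scale multiplied by 100.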


This item appears in the following Collection(s)

  • MS [432]
