NUST Institutional Repository

An Unsupervised NLP Approach for Cross-lingual Urdu-English Text Summarization: A Framework

dc.contributor.author Shafiq, Zertashia
dc.date.accessioned 2024-08-07T11:31:43Z
dc.date.available 2024-08-07T11:31:43Z
dc.date.issued 2024-08-06
dc.identifier.other 363894
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/45260
dc.description Dr. Usman Qamar en_US
dc.description.abstract We live in an era of linguistic diversity and global interconnectedness, in which the ability to bridge language barriers is essential to cross-cultural communication. This study presents a framework for cross-lingual Urdu-to-English extractive text summarization using an unsupervised NLP approach. The framework applies a sequence of steps to a manually prepared, language-specific dataset: it integrates text translation, summarization with the TextRank algorithm, ROUGE score calculation, and sentiment analysis to support language comprehension and conversion. The motivation for this research is the need to address the linguistic divide in a multilingual society such as Pakistan, where Urdu serves as the national language but English also holds significant importance, especially in professional and educational settings. The primary objective is to develop a framework capable of accurately translating cross-lingual content while preserving the semantic meaning of the context. The framework comprises several components, including a manually curated dataset paired with human-generated summaries, and uses ROUGE scores to assess the accuracy and effectiveness of the framework-generated summaries. The methodology encompasses dataset preparation, text translation, summarization, evaluation using ROUGE scores, and sentiment analysis to give the reader a gist of the overall sentiment of the content. The findings of this study contribute to the advancement of cross-lingual text summarization technologies. en_US
dc.language.iso en en_US
dc.publisher College of Electrical & Mechanical Engineering (CEME), NUST en_US
dc.subject unsupervised NLP, machine learning, text summarization, extractive summarization, cross lingual, TextRank, parallel corpus, translation, sentiment analysis, framework en_US
dc.title An Unsupervised NLP Approach for Cross-lingual Urdu-English Text Summarization: A Framework en_US
dc.type Thesis en_US
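
The record does not include the thesis code, so the following is only a minimal sketch of two steps named in the abstract: extractive summarization with TextRank and ROUGE-1 scoring against a human-written reference. It assumes the Urdu text has already been translated into English. The function names, the naive sentence splitter, the word-overlap similarity, and the damping factor of 0.85 are illustrative assumptions, not the author's implementation. Translation and sentiment analysis are left out because they depend on external models that the record does not identify.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; adequate for already-translated English text.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    # Word-overlap similarity: shared word types divided by the summed
    # logarithms of the two sentence lengths.
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence-similarity graph.
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    # Keep the k highest-scoring sentences, returned in document order.
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge1(candidate, reference):
    # ROUGE-1: clipped unigram overlap between candidate and reference.
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

In use, `summarize(translated_text, k=3)` would return three sentences, and `rouge1(" ".join(summary), human_summary)` would compare them with a reference summary. A real evaluation would more likely rely on an established ROUGE implementation with stemming and ROUGE-2 or ROUGE-L variants; the hand-rolled version here only shows what the score measures.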


This item appears in the following Collection(s)

  • MS [441]
