dc.description.abstract |
Extractive summarization involves selecting and condensing key information from a text document while preserving the overall meaning and coherence of the original content. Several extractive summarization methods exist, each with its own benefits and drawbacks; despite this variety of approaches, none is flawless, and there is still room for advancement in the field of automated summarization.
One promising direction for extractive summarization is the use of deep learning models such as BERT. BERT is a multilayer transformer network pretrained on a large dataset with self-supervised objectives, and it has been applied to a variety of tasks, including language translation, question answering, and natural language understanding. However, BERT limits its input to 512 tokens, which makes it less suitable for summarizing long documents.
In this study, we propose an approach that enables the use of BERT for long-document summarization. Our method divides the document into single-sentence chunks, uses BERT to generate an embedding for each sentence, and applies an encoder-decoder model on top of these embeddings to produce the summary. The encoder-decoder model is a type of neural network commonly used for machine translation and text generation tasks. We carried out experiments on two scholarly datasets, arXiv and PubMed, to evaluate the effectiveness of our approach. The results show that our technique consistently outperformed several state-of-the-art extractive summarization models, demonstrating the potential of our method to improve the efficiency and accuracy of summarization tasks. |
en_US |
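
To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch in Python (PyTorch and Hugging Face transformers). It is an assumption-laden stand-in, not the thesis's actual architecture: it uses the [CLS] vector of bert-base-uncased as the sentence embedding and a small, untrained Transformer encoder with a linear scoring head in place of the full encoder-decoder stage, selecting the top-scoring sentences as the extractive summary.

# Illustrative sketch only: sentence-level BERT embeddings fed to a small
# Transformer encoder that scores each sentence for inclusion in an
# extractive summary. The scorer is untrained here, so the selection is
# random; the point is the data flow, not the quality of the output.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()

def embed_sentences(sentences):
    """Encode each single-sentence chunk separately with BERT and use the
    [CLS] vector as a fixed-size sentence embedding (768 dims)."""
    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, truncation=True,
                          max_length=128, return_tensors="pt")
        out = bert(**batch)
        return out.last_hidden_state[:, 0, :]   # (num_sentences, 768)

class SentenceScorer(nn.Module):
    """Hypothetical stand-in for the encoder-decoder stage: a Transformer
    encoder over the sequence of sentence embeddings, followed by a
    per-sentence relevance score."""
    def __init__(self, d_model=768, nhead=8, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.score = nn.Linear(d_model, 1)

    def forward(self, sent_embeddings):          # (1, num_sentences, 768)
        h = self.encoder(sent_embeddings)
        return self.score(h).squeeze(-1)         # (1, num_sentences)

sentences = [
    "Transformers have become the dominant architecture in NLP.",
    "BERT accepts at most 512 subword tokens per input.",
    "Long documents therefore have to be split before encoding.",
]
emb = embed_sentences(sentences).unsqueeze(0)    # add batch dimension
scores = SentenceScorer()(emb)
top = torch.topk(scores[0], k=2).indices.sort().values
summary = " ".join(sentences[i] for i in top)    # keep original order
print(summary)

Because each chunk is a single sentence, no chunk ever exceeds BERT's token limit, which is what lets this scheme scale to arbitrarily long documents; the second-stage model then reasons over the whole document at the sentence level.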