dc.description.abstract |
Extractive summarization involves selecting and condensing key information from a text document while preserving the overall meaning and coherence of the original content. Several extractive summarization methods exist, each with its own benefits and drawbacks; despite this variety of approaches, none is flawless, and there is still room for advancement in the field of automated summarization.
One promising direction for extractive summarization is the use of deep learning models such as BERT. BERT is a multilayer transformer network pretrained on a large dataset with self-supervised objectives, and it has been applied to a variety of tasks, including language translation, question answering, and natural language understanding. However, BERT limits its input to 512 tokens, which makes it less suitable for summarizing long documents.
In this study, we propose an approach that enables the use of BERT for long-document summarization. Our method divides the document into single-sentence chunks, uses BERT to generate an embedding for each sentence, and applies an encoder-decoder model on top of these embeddings to produce the summary. The encoder-decoder model is a type of neural network commonly used for machine translation and text generation tasks. We carried out experiments on two scholarly datasets, arXiv and PubMed, to evaluate the effectiveness of our approach. The results show that our technique consistently outperformed several state-of-the-art extractive summarization models, demonstrating the potential of our method to improve the efficiency and accuracy of summarization tasks. |
en_US |
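
To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch in Python (PyTorch and Hugging Face transformers). It is an assumption-laden stand-in, not the thesis's actual architecture: it uses the [CLS] vector of bert-base-uncased as the sentence embedding and a small, untrained Transformer encoder with a linear scoring head in place of the full encoder-decoder stage, selecting the top-scoring sentences as the extractive summary.

# Illustrative sketch only: sentence-level BERT embeddings fed to a small
# Transformer encoder that scores each sentence for inclusion in an
# extractive summary. The scorer is untrained here, so the selection is
# random; the point is the data flow, not the quality of the output.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()

def embed_sentences(sentences):
    """Encode each single-sentence chunk separately with BERT and use the
    [CLS] vector as a fixed-size sentence embedding (768 dims)."""
    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, truncation=True,
                          max_length=128, return_tensors="pt")
        out = bert(**batch)
        return out.last_hidden_state[:, 0, :]   # (num_sentences, 768)

class SentenceScorer(nn.Module):
    """Hypothetical stand-in for the encoder-decoder stage: a Transformer
    encoder over the sequence of sentence embeddings, followed by a
    per-sentence relevance score."""
    def __init__(self, d_model=768, nhead=8, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.score = nn.Linear(d_model, 1)

    def forward(self, sent_embeddings):          # (1, num_sentences, 768)
        h = self.encoder(sent_embeddings)
        return self.score(h).squeeze(-1)         # (1, num_sentences)

sentences = [
    "Transformers have become the dominant architecture in NLP.",
    "BERT accepts at most 512 subword tokens per input.",
    "Long documents therefore have to be split before encoding.",
]
emb = embed_sentences(sentences).unsqueeze(0)    # add batch dimension
scores = SentenceScorer()(emb)
top = torch.topk(scores[0], k=2).indices.sort().values
summary = " ".join(sentences[i] for i in top)    # keep original order
print(summary)

Because each chunk is a single sentence, no chunk ever exceeds BERT's token limit, which is what lets this scheme scale to arbitrarily long documents; the second-stage model then reasons over the whole document at the sentence level.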