NUST Institutional Repository

Text Summarization from Judicial Records using Deep Neural Machines

dc.contributor.author Ayesha Sarwar
dc.date.accessioned 2022-01-03T11:46:01Z
dc.date.available 2022-01-03T11:46:01Z
dc.date.issued 2021
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/28261
dc.description.abstract The judiciary is the branch of government tasked with the administration of justice. Courts generate a large amount of data in the form of legal proceedings: the legal documents comprise cases and their judgments. A judgment is a long and detailed document, and to prepare for a case a lawyer has to read through hundreds of legal documents to find the relevant judgments. In Pakistan, the ratio of cases registered every year to judgments delivered is very high, mainly because of the time it takes to prepare for a trial. Providing lawyers and judges with summaries of the relevant judgments would not only give them an overview without reading the whole judgment but also save a great deal of their time, so that more judgments can be delivered every year. Artificial Intelligence (AI) is finding applications in all domains of our lives, and AI techniques can also be helpful in courtrooms. Text summarization is an application of Natural Language Processing (NLP) that can provide a brief overview of a judgment to both lawyers and judges. Transformer-based models are nowadays the benchmark for sequence-to-sequence modelling problems in NLP, so they can be used to help legal domain experts save the time spent writing judgment summaries in the real world. However, text summarization for legal documents differs from that for regular text: the summarization task depends on the type of summary required, and legal documents span tens of pages and therefore contain far more words. Existing models pre-trained on regular text are thus of limited help. Among transformer-based models, the Longformer was introduced recently to handle long input sequences of up to 16,384 tokens [2]. Training a model with such a configuration demands high computational power. Fine-tuning a pre-trained legal Longformer Encoder-Decoder (LED) on the downstream summarization task showed better accuracy scores on the dataset (see the sketch after this record). en_US
dc.description.sponsorship Dr. Faisal Shafait en_US
dc.language.iso en en_US
dc.publisher SEECS, National University of Sciences & Technology (NUST), Islamabad. en_US
dc.subject MSCS SEECS 2021 en_US
dc.title Text Summarization from Judicial Records using Deep Neural Machines en_US
dc.type Thesis en_US
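
The abstract describes fine-tuning a pre-trained Longformer Encoder-Decoder (LED) for long-document summarization. As a minimal sketch only, the Python code below shows how such a model can be loaded and applied to a long judgment with the Hugging Face transformers library. The public checkpoint "allenai/led-base-16384" and the input file name are assumptions for illustration; the record does not name the thesis's fine-tuned legal checkpoint, its training data, or its generation settings.

import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

# Public LED base checkpoint (assumption); the thesis fine-tunes a legal variant.
model_name = "allenai/led-base-16384"
tokenizer = LEDTokenizer.from_pretrained(model_name)
model = LEDForConditionalGeneration.from_pretrained(model_name)

# Hypothetical input: the full text of one judgment.
with open("judgment.txt", encoding="utf-8") as f:
    judgment_text = f.read()

# LED accepts inputs of up to 16,384 tokens by combining windowed local
# attention with global attention on selected tokens.
inputs = tokenizer(
    judgment_text,
    max_length=16384,
    truncation=True,
    return_tensors="pt",
)

# Standard LED usage places global attention on the first (<s>) token so that
# every position can attend to it, and it to every position.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=512,   # illustrative cap on summary length
    num_beams=4,      # illustrative beam-search width
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

Fine-tuning on judgment/summary pairs would follow the usual sequence-to-sequence training loop (e.g. with the transformers Trainer), using the same long-input tokenization and global-attention setup on the encoder side.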


This item appears in the following Collection(s)

  • MS [375]
