NUST Institutional Repository

Automated Anonymization of Court Room Records

dc.contributor.author Daud, Marium
dc.date.accessioned 2022-04-13T09:17:15Z
dc.date.available 2022-04-13T09:17:15Z
dc.date.issued 2021
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/29131
dc.description.abstract Thousands of cases are registered every year in the Supreme Court and High Courts of Pakistan, and almost all of the resulting judgments are available on the courts' respective websites as PDF documents. Because these easily accessible documents contain personal information about the parties involved, such as petitioner and respondent names, organization names and addresses, anyone can identify the persons and organizations mentioned in the judgments and intrude on their privacy. The documents do not follow a fixed format and are only semi-structured, so extracting the parties' personal information is difficult. Automated anonymization of court room records offers a way to de-identify the personal information in these documents, although their loose structure and the ambiguity of natural language make this a challenging task. This research focuses on extracting the personal information of the parties involved and anonymizing it in publicly accessible documents. We trained BERT-NER to extract three labels on a dataset of 213 judgments of the Supreme Court of Pakistan. We created this dataset by collecting raw judgments from the Supreme Court of Pakistan’s website and labelling them according to annotation guidelines that we formulated. The three labels are Per (Person), Org (Organization) and Loc (Location); after extraction with NER, the labelled entities were anonymized by replacing them with generic words. en_US
dc.description.sponsorship Dr. Faisal Shafait en_US
dc.language.iso en en_US
dc.publisher SEECS, National University of Sciences & Technology Islamabad en_US
dc.subject Court Room Records-Automated Anonymization en_US
dc.title Automated Anonymization of Court Room Records en_US
dc.type Thesis en_US
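
The replacement step described in the abstract can be illustrated with a minimal sketch. It assumes that an NER model has already produced character-offset spans tagged Per, Org or Loc; the function name `anonymize`, the placeholder strings and the sample sentence are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical generic words for the three labels named in the abstract.
PLACEHOLDERS = {"Per": "[PERSON]", "Org": "[ORGANIZATION]", "Loc": "[LOCATION]"}

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def anonymize(text: str, spans: List[Span]) -> str:
    """Replace each labelled span with the generic word for its label."""
    # Work from the last span to the first so that earlier offsets
    # remain valid after each replacement changes the text length.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + PLACEHOLDERS[label] + text[end:]
    return text


# Invented sample sentence with hand-written spans standing in for NER output.
sample = "Ali Khan filed a petition against ABC Bank in Lahore."
spans = [(0, 8, "Per"), (34, 42, "Org"), (46, 52, "Loc")]
print(anonymize(sample, spans))
```

Replacing from right to left is a design choice that avoids recomputing offsets; the spans are assumed not to overlap.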


This item appears in the following Collection(s)

  • MS [375]
