NUST Institutional Repository

SENTENCE EXTRACTION BASED AUTOMATIC TEXT

dc.contributor.author Ayyaz, Sundus
dc.date.accessioned 2023-08-18T10:29:19Z
dc.date.available 2023-08-18T10:29:19Z
dc.date.issued 2012
dc.identifier.other (2010-NUST-MS PhD-CSE(E)-25)
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/36883
dc.description Supervisor: Dr Muhammad Younus Javed en_US
dc.description.abstract The rapid growth of digital data on the web has created a problem of information overload. Many users struggle to find the relevant information they need in time from huge online repositories. Automatic text summarization addresses this problem by compressing a text into a shorter form that contains only the meaningful information, so that the user does not have to read every line of a document to understand its core concept. This thesis covers the design, implementation and analysis of an optimized fuzzy model that uses a feature-term-based automatic text summarization method, built on sentence extraction, to generate meaningful summaries of scientific documents. First, the text document to be summarized is given to the system, and the preprocessing stage removes noise from the input and produces a clean document. The proposed model consists of three methods. The first is the General Statistical Method (GSM), in which feature terms are extracted through paragraph and sentence segmentation, followed by tokenization, stop word removal, case folding and removal of non-essential sentences from the document. Weights are assigned to the identified feature terms (cue words, frequent words and sentence position), a score is calculated for each sentence, and the high-scoring sentences are extracted. In the second method, the Fuzzy Logic Model (FL), the output of GSM and the identified features are used as input to a fuzzy inference system (FIS). Using a fuzzy rule set, the FIS extracts the most important of the selected sentences for inclusion in the summary. In the third method, the Optimized Fuzzy Model (OFM), the input and output fuzzy parameters and the fuzzy rule weights are optimized to obtain an optimized weight for each feature. Each sentence score is then calculated from these weights, and the highest-scoring sentences are selected for the final optimized summary document. The proposed technique is implemented in Java using NetBeans IDE 6.9.1 and the jFuzzyLogic 2.1a package. To evaluate the system, the summaries generated by each of the three methods are tested against a gold-standard (human-generated) summary and compared with each other, as well as with other summarizers such as the MS Word 2007 summarizer and Essential summarizer, for a comprehensive efficiency analysis. Precision, Recall and F-measure are calculated for each generated summary. en_US
dc.language.iso en en_US
dc.publisher College of Electrical & Mechanical Engineering (CEME), NUST en_US
dc.title SENTENCE EXTRACTION BASED AUTOMATIC TEXT en_US
dc.type Thesis en_US
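
The statistical scoring and the evaluation measures described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which was written in Java with jFuzzyLogic and is not reproduced in this record. The stop-word list, the cue-word list, the feature weights and every function name below are hypothetical, and the fuzzy inference and optimization stages are left out. The sketch only shows how frequent words, cue words and sentence position might be combined into a sentence score, and how Precision, Recall and F-measure might be computed against a human-selected set of sentences.

```python
import re
from collections import Counter

# Hypothetical word lists; the lists used in the thesis are not given in the abstract.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "for", "on",
              "this", "that", "it", "are", "by", "with"}
CUE_WORDS = {"propose", "proposed", "results", "conclude", "significant", "summary"}


def split_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Case folding, tokenization and stop-word removal."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def score_sentences(sentences, w_freq=1.0, w_cue=1.0, w_pos=1.0):
    """Score each sentence from three features; the weights are illustrative only."""
    tokens = [tokenize(s) for s in sentences]
    counts = Counter(t for toks in tokens for t in toks)
    max_count = max(counts.values(), default=1)
    n = len(sentences)
    scores = []
    for i, toks in enumerate(tokens):
        if not toks:
            scores.append(0.0)
            continue
        # Frequent-word feature: mean document frequency of the sentence's terms, scaled to [0, 1].
        freq = sum(counts[t] for t in toks) / (len(toks) * max_count)
        # Cue-word feature: fraction of terms that appear in the cue-word list.
        cue = sum(1 for t in toks if t in CUE_WORDS) / len(toks)
        # Position feature: earlier sentences receive a higher value.
        pos = (n - i) / n
        scores.append(w_freq * freq + w_cue * cue + w_pos * pos)
    return scores


def extract_summary(scores, k):
    """Return the indices of the k highest-scoring sentences, in document order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def precision_recall_f(selected, reference):
    """Compare selected sentence indices with a human-chosen reference set."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    p = overlap / len(selected) if selected else 0.0
    r = overlap / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

A typical use would be to call `split_sentences` on the cleaned document, pass the result to `score_sentences`, select indices with `extract_summary`, and then compare them with the sentence indices of a human-generated extract using `precision_recall_f`. Comparing sentence indices is one simple way to compute the three measures; the abstract does not say at what granularity the thesis computes them.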


This item appears in the following Collection(s)

  • MS [441]
