SENTENCE EXTRACTION BASED AUTOMATIC TEXT

Ayyaz, Sundus

dc.contributor.author	Ayyaz, Sundus
dc.date.accessioned	2023-08-18T10:29:19Z
dc.date.available	2023-08-18T10:29:19Z
dc.date.issued	2012
dc.identifier.other	(2010-NUST-MS PhD-CSE(E)-25)
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/36883
dc.description	Supervisor: Dr Muhammad Younus Javed	en_US
dc.description.abstract	The rapid growth of digital data on the web has created the problem of information overload. Many users find it difficult to retrieve the relevant information they need in time from huge online repositories. Automatic text summarization addresses this problem by compressing text into a shorter form containing only the meaningful information, so that the user need not read every line of a document to understand its core concept. This thesis focuses on the design, implementation and analysis of an optimized fuzzy model that uses a feature-term-based automatic text summarization method, based on sentence extraction, to generate meaningful summaries of scientific documents. Initially, the text document to be summarized is given to the system, and the preprocessing stage removes noise from the input document and produces a clean document. The proposed model consists of three methods. The first is the General Statistical Method (GSM), in which feature terms are extracted through paragraph and sentence segmentation, which includes the further steps of tokenization, stop word removal, case folding and removal of non-essential sentences from the document. Based on the identified feature terms (cue words, frequent words and sentence position), weights are assigned, each sentence score is calculated, and the high-scoring sentences are extracted. In the second method, the Fuzzy Logic Model (FL), the output of GSM and the identified features are used as input to a fuzzy inference system (FIS). On the basis of a fuzzy rule set, the FIS extracts the most important of the selected sentences for inclusion in the summary. In the third method, the Optimized Fuzzy Model (OFM), the input and output fuzzy parameters as well as the fuzzy rule weights are optimized to obtain the optimized weight of each feature. Each sentence score is then calculated based on these weights, and the highest-scoring sentences are selected for inclusion in the final optimized summary document. The proposed technique is implemented in Java using NetBeans IDE 6.9.1 and the jFuzzyLogic 2.1a package. To evaluate the system, the summaries generated by each of the three methods are tested against the gold-standard (human-generated) summary and compared with each other as well as with other summarizers, such as the MS Word 2007 summarizer and the Essential summarizer, for a comprehensive efficiency analysis. Evaluation measures such as precision, recall and F-measure are calculated for each generated summary.	en_US
dc.language.iso	en	en_US
dc.publisher	College of Electrical & Mechanical Engineering (CEME), NUST	en_US
dc.title	SENTENCE EXTRACTION BASED AUTOMATIC TEXT	en_US
dc.type	Thesis	en_US
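
The abstract names precision, recall and F-measure as the evaluation measures but does not say how they were computed. The following is a minimal sketch, not the thesis's implementation. It assumes that a summary is represented as a set of sentence indices extracted from the source document, and that the gold-standard summary is represented the same way. The class name `SummaryEval` and all method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: compares an extracted summary with a gold-standard
// summary, both given as sets of sentence indices (an assumption of this sketch).
public class SummaryEval {

    // Number of sentences that appear in both summaries.
    static int overlap(Set<Integer> system, Set<Integer> reference) {
        Set<Integer> common = new HashSet<>(system);
        common.retainAll(reference);
        return common.size();
    }

    // Fraction of extracted sentences that are also in the gold standard.
    static double precision(Set<Integer> system, Set<Integer> reference) {
        return system.isEmpty() ? 0.0
                : (double) overlap(system, reference) / system.size();
    }

    // Fraction of gold-standard sentences that were extracted.
    static double recall(Set<Integer> system, Set<Integer> reference) {
        return reference.isEmpty() ? 0.0
                : (double) overlap(system, reference) / reference.size();
    }

    // Harmonic mean of precision and recall (balanced F-measure).
    static double fMeasure(Set<Integer> system, Set<Integer> reference) {
        double p = precision(system, reference);
        double r = recall(system, reference);
        return (p + r == 0.0) ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Illustrative data only, not taken from the thesis.
        Set<Integer> system = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> reference = new HashSet<>(Arrays.asList(2, 3, 5));
        System.out.println("Precision: " + precision(system, reference));
        System.out.println("Recall:    " + recall(system, reference));
        System.out.println("F-measure: " + fMeasure(system, reference));
    }
}
```

In this sketch an empty summary yields a score of zero instead of a division by zero. Whether the thesis weights precision and recall equally in its F-measure is not stated in the abstract, so the balanced form used here is an assumption.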