dc.contributor.author |
Rida Hafeez |
|
dc.date.accessioned |
2020-12-09T06:36:26Z |
|
dc.date.available |
2020-12-09T06:36:26Z |
|
dc.date.issued |
2017 |
|
dc.identifier.uri |
http://10.250.8.41:8080/xmlui/handle/123456789/17171 |
|
dc.description |
Supervisor: Dr. Sharifullah Khan |
en_US |
dc.description.abstract |
Preprocessing is an essential first step in automatic taxonomy generation for text documents because text data is unstructured and is more inconsistent and noisy than structured data. Different taxonomy generation systems apply different preprocessing steps during generation. However, there is no existing benchmark for analyzing the impact of preprocessing techniques on the quality of the taxonomy. To overcome this deficiency, a new methodology is proposed for comparing various preprocessing techniques and evaluating the quality of the generated taxonomy. Different combinations of preprocessing techniques were selected and applied in generating the taxonomy to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy and presents a comparative analysis based on various evaluation metrics. The combinations of preprocessing techniques were applied in taxonomy generation on two text data sets selected from different domains, i.e., ACM and MEDLINE. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of the automated taxonomy. However, applying all preprocessing techniques in the generation process does not guarantee high quality. The experiments were conducted on document-based taxonomy; in future, the scope of the research can be extended to concept-based taxonomy as well. |
en_US |
dc.publisher |
SEECS, National University of Sciences and Technology, Islamabad |
en_US |
dc.subject |
Information Technology |
en_US |
dc.title |
The Impact of Pre-Processing On Automated Taxonomy Generation |
en_US |
dc.type |
Thesis |
en_US |