NUST Institutional Repository

Semantic Search using Thematic Similarity in Digital Documents

dc.contributor.author Butt, Madiha
dc.date.accessioned 2020-10-28T10:28:47Z
dc.date.available 2020-10-28T10:28:47Z
dc.date.issued 2015
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/6620
dc.description Supervisor: Dr. Sharifullah Khan en_US
dc.description.abstract Typical semantic search systems resolve semantic heterogeneity by augmenting query keywords through a domain ontology. They treat keywords individually, as either concepts or relationships of the ontology, and ignore the semantic relationships that exist between keywords. As a result, complex queries cannot be answered accurately even when the query's keywords are augmented with different semantic relationships. Finding the right document is possible only if a system knows the meanings of the concepts and the relationships that exist among them. The proposed system takes both the concepts and the relationships among them into account to capture context. It searches by matching RDF triples rather than individual keywords, and ranks documents by the relevance scores of their triples. To validate the proposed semantic similarity measure, a prototype system has been implemented. The measure uses both the structure of the ontology and statistical information content to compute semantic similarity. By combining a taxonomic structure with empirical probability estimates, it provides a way of adapting a static knowledge structure to multiple contexts. Through RDF triple matching, the system performs context-based information retrieval. The system has been evaluated by repeating the Miller and Charles experiment and by comparing the proposed measure with several other similarity measures. Experimental results demonstrate better performance than recent similarity measures. We have also evaluated the measure on the Pilot Short Text Semantic Similarity Benchmark Data Set (STASIS) and obtained 85% correlation with STASIS. In future, we intend to consider the most appropriate sense of a concept to further improve accuracy. en_US
dc.publisher SEECS, National University of Science & Technology en_US
dc.subject Semantic Search, Thematic Similarity, Digital Documents, Computer Science en_US
dc.title Semantic Search using Thematic Similarity in Digital Documents en_US
dc.type Thesis en_US
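
The abstract describes ranking documents by matching RDF triples with a similarity measure that combines ontology structure and information content, but this record does not give the thesis's actual formula. The following is only a minimal sketch of the general idea under stated assumptions: the taxonomy, the concept probabilities, the equal weighting of subject, predicate and object, and every function name are hypothetical, and Lin's information-content measure is used as a stand-in for the proposed measure.

```python
import math

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "person": "entity",
    "author": "person",
    "student": "person",
    "document": "entity",
    "thesis": "document",
    "article": "document",
}

# Hypothetical concept probabilities; a parent's mass includes its descendants.
PROB = {
    "entity": 1.0,
    "person": 0.5, "author": 0.2, "student": 0.3,
    "document": 0.5, "thesis": 0.1, "article": 0.4,
}


def ancestors(concept):
    """Return the concept followed by its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def information_content(concept):
    """Information content: rarer concepts carry more information."""
    return -math.log(PROB[concept])


def concept_similarity(a, b):
    """Lin similarity (a stand-in, not the thesis's measure).

    The taxonomy supplies the lowest common subsumer; the probability
    estimates supply the information content.
    """
    if a == b:
        return 1.0
    ancestors_of_b = set(ancestors(b))
    lcs = next(c for c in ancestors(a) if c in ancestors_of_b)
    denom = information_content(a) + information_content(b)
    return 2 * information_content(lcs) / denom if denom > 0 else 0.0


def triple_similarity(query_triple, doc_triple):
    """Compare (subject, predicate, object) triples component-wise.

    Assumption: predicates match exactly or not at all, and the three
    components are weighted equally.
    """
    qs, qp, qo = query_triple
    ds, dp, do = doc_triple
    predicate_score = 1.0 if qp == dp else 0.0
    return (concept_similarity(qs, ds) + predicate_score
            + concept_similarity(qo, do)) / 3


def document_score(query_triples, doc_triples):
    """Average, over query triples, of the best-matching document triple."""
    if not query_triples or not doc_triples:
        return 0.0
    best = [max(triple_similarity(q, d) for d in doc_triples)
            for q in query_triples]
    return sum(best) / len(best)


def rank_documents(query_triples, documents):
    """Return (doc_id, score) pairs sorted by descending relevance."""
    scored = [(doc_id, document_score(query_triples, triples))
              for doc_id, triples in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "d1": [("author", "writes", "thesis")],
        "d2": [("student", "reads", "article")],
    }
    query = [("author", "writes", "thesis")]
    for doc_id, score in rank_documents(query, docs):
        print(doc_id, round(score, 3))
```

In this sketch a document whose triple repeats the query triple scores highest, while a document that shares only related concepts still receives partial credit through the taxonomy, which is the behaviour that keyword-by-keyword matching cannot provide.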


This item appears in the following Collection(s)

  • MS [375]
