Generation of Domain Ontologies from Text

Qadir, Bushra

dc.contributor.author	Qadir, Bushra
dc.date.accessioned	2020-11-02T09:30:19Z
dc.date.available	2020-11-02T09:30:19Z
dc.date.issued	2015
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/8281
dc.description	Supervisor: Dr. Sharifullah Khan	en_US
dc.description.abstract	The development of high-throughput experimental techniques and computational models has accelerated the pace of research and development in biomedicine. Large numbers of genes and proteins are analyzed at a time, with the aim of obtaining new findings about diseases in order to improve human health. This has resulted in exponential growth in the field of molecular biology. Knowledge of molecular interactions between genes and transcription factors is of great interest to biologists. However, most gene interactions are scattered throughout the scientific literature, which is written in natural language and is difficult for computers to process directly. Traditional search engines provide only modest help, as they return thousands of relevant documents, and the user still has to read all of the returned documents to find the information they need. It is becoming increasingly difficult to discover the required knowledge without information extraction techniques. Existing approaches that extract gene interactions from bibliographical resources have limitations that need to be addressed. They are limited to single interaction relations, in which a single keyword expresses the relationship between the entities involved. Current relation extraction systems ignore sentences with multiple interaction keywords. They also ignore sentences that contain regulatory information but no explicit relationship keyword. This results in extraction errors. In this research work we propose a rule-based extraction system that automatically extracts relations between entities, such as genes and transcription factors, from biomedical text and presents the distilled information to users in a structured and concise form. Our approach uses rules based on regular expressions over annotations to address the limitations of existing approaches.
To validate the proposed methodology, a prototype system has been implemented. The system has been evaluated against a gold-standard annotation set and compared with existing systems. The experimental results show an improvement in accuracy, with an average precision of 82.3% and an average recall of 89.9%. In future, we intend to incorporate coreference resolution into our system to further improve its accuracy.	en_US
dc.publisher	SEECS, National University of Science & Technology	en_US
dc.subject	Generation, Domain, Ontologies, Computer Science	en_US
dc.title	Generation of Domain Ontologies from Text	en_US
dc.type	Thesis	en_US
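
The abstract describes rules built from regular expressions over annotations, including sentences that contain more than one interaction keyword. The record does not give the thesis's actual rules, so the following is only a minimal sketch of the general idea. The inline annotation format (`[TF:...]`, `[GENE:...]`), the keyword list, and the function name `extract_relations` are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list; the real system's vocabulary is not given in this record.
KEYWORDS = re.compile(r"\b(?:activates|represses|regulates|inhibits|binds)\b")

# Hypothetical annotation format: entities are pre-tagged inline as
# [TF:name] (transcription factor) and [GENE:name].
RULE = re.compile(
    r"\[TF:(?P<tf>[^\]]+)\]"      # annotated transcription factor
    r"(?P<between>[^\[\]]*?)"     # text in between, containing no other annotation
    r"\[GENE:(?P<gene>[^\]]+)\]"  # annotated gene
)


def extract_relations(sentence):
    """Return (tf, keywords, gene) triples found in one annotated sentence.

    All interaction keywords between the two entities are kept, so a
    sentence with several keywords yields one triple carrying all of them.
    Sentences with no keyword between the entities yield nothing here.
    """
    relations = []
    for match in RULE.finditer(sentence):
        keywords = KEYWORDS.findall(match.group("between"))
        if keywords:
            relations.append((match.group("tf"), tuple(keywords), match.group("gene")))
    return relations


if __name__ == "__main__":
    example = "[TF:FNR] activates and then represses [GENE:narK]."
    print(extract_relations(example))
    # [('FNR', ('activates', 'represses'), 'narK')]
```

This sketch skips sentences that carry regulatory information without an explicit keyword. Handling that case, which the abstract names as a goal, would need additional rules that this record does not describe.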