Detecting Cross Domain Ambiguity in Requirements Through Natural Language Processing Approach

Khalil, Ibrahim

URI: http://10.250.8.41:8080/xmlui/handle/123456789/35374

Date: 2023-07

Abstract:

Background: In requirements elicitation, various techniques are adopted to gather the exact needs of stakeholders, who usually come from different backgrounds. These techniques are used to clarify the actual problem being solved. The terms used in the requirements are also likely to be ambiguous: a term used by stakeholders may change its meaning from domain to domain, which can lead to an unintended interpretation of the requirements.

Aim & Objectives: A project's success can be estimated only if the initially collected requirements are clear, unambiguous, and well understood. Conversely, ambiguous or unintelligible requirements can lead to the failure or outright cancellation of the project. An initial step in requirements elicitation is usually gathering requirements in natural language. This study analyzes the tools, techniques, and approaches used for detecting ambiguities in natural language requirements, validates the approaches applied to term ambiguity across domains, and develops and applies a more precise approach for extracting domain terms, finding similarity, and ranking terms by their semantic ambiguity.

Methodology: Word2Vec was found to be the algorithm most widely used for detecting ambiguous words in text. It was replaced with FastText on the same data to determine which of the two is more suitable. Ambiguity scores were calculated for the ambiguous terms, and the scores of the highly ambiguous terms produced by Word2Vec were compared with those produced by FastText.

Results and Conclusion: Data from five different domains were assessed with the Word2Vec and FastText algorithms. Ambiguous terms were extracted and then ranked by their level of ambiguity. The rankings that the two algorithms produced for the same term were compared, and the differences between the rankings were calculated. This approach seeks to disambiguate texts and improve the process of software requirements elicitation in natural language.
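
The comparison described under Methodology can be illustrated with a minimal Python sketch. The abstract does not state how the ambiguity score is computed, so this sketch assumes one plausible choice: a term counts as more ambiguous the less its nearest neighbours overlap between domains. The function names, the two domains, and the neighbour lists are all hypothetical toy data, not results from the thesis. In practice the neighbour lists might be taken from Word2Vec or FastText models trained separately on each domain corpus.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two neighbour lists (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def ambiguity_score(neighbours_by_domain):
    """ASSUMED score: 1 minus the mean pairwise neighbour overlap across domains.

    A term whose neighbours differ completely between domains scores 1.0;
    a term with identical neighbours everywhere scores 0.0.
    """
    pairs = list(combinations(neighbours_by_domain.values(), 2))
    if not pairs:
        return 0.0
    mean_overlap = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_overlap


def rank_terms(neighbours):
    """Rank terms from most ambiguous (rank 1) to least; ties break alphabetically."""
    scores = {term: ambiguity_score(by_domain) for term, by_domain in neighbours.items()}
    ordered = sorted(scores, key=lambda term: (-scores[term], term))
    return {term: position for position, term in enumerate(ordered, start=1)}


def rank_differences(ranks_a, ranks_b):
    """Absolute rank difference for every term ranked by both models."""
    return {term: abs(ranks_a[term] - ranks_b[term]) for term in ranks_a if term in ranks_b}


# Hypothetical nearest neighbours per term and domain, one table per model.
word2vec_neighbours = {
    "kernel": {"os": ["process", "driver", "scheduler"], "ml": ["svm", "gaussian", "feature"]},
    "user": {"os": ["account", "login", "session"], "ml": ["account", "login", "profile"]},
    "window": {"os": ["screen", "desktop", "frame"], "ml": ["frame", "stride", "context"]},
}
fasttext_neighbours = {
    "kernel": {"os": ["kernels", "driver", "scheduler"], "ml": ["kernels", "gaussian", "svm"]},
    "user": {"os": ["users", "login", "session"], "ml": ["users", "login", "profile"]},
    "window": {"os": ["screen", "desktop", "pane"], "ml": ["stride", "context", "sliding"]},
}

word2vec_ranks = rank_terms(word2vec_neighbours)
fasttext_ranks = rank_terms(fasttext_neighbours)
differences = rank_differences(word2vec_ranks, fasttext_ranks)

for term in sorted(differences):
    print(term, word2vec_ranks[term], fasttext_ranks[term], differences[term])
```

Neighbour overlap is used here because embeddings trained separately on each domain live in different vector spaces, so comparing the raw vectors of one term across domains would not be meaningful without an alignment step.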