A Framework for Clone Detection in UML Models

Irshad, Ayesha

dc.contributor.author	Irshad, Ayesha
dc.date.accessioned	2024-07-23T10:14:22Z
dc.date.available	2024-07-23T10:14:22Z
dc.date.issued	2024-07-23
dc.identifier.other	-328534
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/44887
dc.description	Supervisor: Dr. Farooque Azam	en_US
dc.description.abstract	Clone detection plays a fundamental role in ensuring the quality and maintainability of software systems. Developers often reuse code components, and the code review or refactoring that would catch copied code is often neglected, which leaves code clones in the system. These clones can cause consistency, bug-propagation, maintainability, and quality issues. UML models are essential artifacts, usually created in the initial phases of software development, that specify and visualize the software design. They serve as a blueprint throughout all phases of development. Clones in UML models therefore induce clones in later stages, propagating and amplifying clone-related issues from the earliest to the final stages of software development. For this reason, it is as essential to identify, track, and remove duplicates in UML models as in code. Furthermore, a key goal of Model Driven Software Engineering (MDSE) is to generate code from models such as UML models, which further increases the importance of model clone detection. This study focuses on the application of Natural Language Processing (NLP) to detect clones within UML models. Initially, a UML model is created and clones are induced in the diagram. The model is then exported in Extensible Markup Language (XML) format to represent it in textual form. Next, the XML is parsed to extract the features of the model that are relevant for clone detection. Because the XML of UML diagrams carries a lot of structural information that is irrelevant for clone detection and is also not balanced, the extracted features are further preprocessed into a suitable format. The extracted data is then labeled to represent clone and non-clone pairs.
NLP techniques are used for clone detection because the names and properties of UML model elements are mostly represented in textual form, so NLP techniques can detect clones in UML models efficiently. The proposed framework is applied to several case studies, which validate the effectiveness of the approach in model clone detection.	en_US
dc.language.iso	en	en_US
dc.publisher	College of Electrical & Mechanical Engineering (CEME), NUST	en_US
dc.subject	MDSE (Model Driven Software Engineering), UML (Unified Modeling Language), State Machine (SM), NLP (Natural Language Processing), Extensible Markup Language (XML)	en_US
dc.title	A Framework for Clone Detection in UML Models	en_US
dc.type	Thesis	en_US
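
The pipeline outlined in the abstract (export the model to XML, parse it, extract textual features, compare element pairs) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the thesis: the XML layout (`model`, `state`, `transition`, `trigger`, `target`), the sample state names, the token-set Jaccard similarity, and the 0.6 threshold are all hypothetical stand-ins, since the record does not say which export format or NLP techniques were used.

```python
import re
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical, simplified XML export of a state machine. Real exports from
# UML tools are far more verbose and use tool-specific namespaces.
SAMPLE_XML = """
<model>
  <state name="WaitForPayment">
    <transition trigger="paymentReceived" target="ShipOrder"/>
  </state>
  <state name="WaitingForPayment">
    <transition trigger="paymentReceived" target="ShipOrder"/>
  </state>
  <state name="CancelOrder">
    <transition trigger="refundIssued" target="Closed"/>
  </state>
</model>
"""


def tokenize(identifier):
    """Split a camelCase or snake_case identifier into lowercase word tokens."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", identifier)
    return [w.lower() for w in words]


def extract_features(xml_text):
    """Return (state name, token set) pairs, dropping all other XML structure."""
    root = ET.fromstring(xml_text.strip())
    features = []
    for state in root.iter("state"):
        name = state.get("name", "")
        tokens = set(tokenize(name))
        for transition in state.iter("transition"):
            tokens.update(tokenize(transition.get("trigger", "")))
            tokens.update(tokenize(transition.get("target", "")))
        features.append((name, tokens))
    return features


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_clone_pairs(features, threshold=0.6):
    """Label element pairs whose token similarity reaches the assumed threshold."""
    pairs = []
    for (name1, tokens1), (name2, tokens2) in combinations(features, 2):
        score = jaccard(tokens1, tokens2)
        if score >= threshold:
            pairs.append((name1, name2, round(score, 2)))
    return pairs


if __name__ == "__main__":
    for pair in find_clone_pairs(extract_features(SAMPLE_XML)):
        print(pair)
```

On the sample input, only the two payment-waiting states share enough tokens to be flagged. A learned text-similarity model could replace `jaccard` without changing the surrounding steps.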