Unsupervised Feature Selection Using Incremental Dependency Classes Based on Rough Set Theory

Abid, Sana

URI: http://10.250.8.41:8080/xmlui/handle/123456789/36035

Date: 2020

Abstract:

Feature selection is a data mining technique that extracts the relevant features of a dataset so that the data can be understood better. It has been applied widely in supervised learning, but it is a challenging task in unsupervised learning because class labels are absent. Selecting features by attribute dependency, based on rough set theory, has recently gained popularity. Unsupervised Incremental Dependency Classes (UIDC) is an algorithm that calculates the dependency of attributes by eliminating the positive region computation. In this thesis, we propose UIDC for unsupervised datasets to calculate attribute dependency as records are added, deleted, or merged. Because unsupervised datasets have no decision attribute or class labels, attribute dependency is difficult to calculate: the conventional calculation involves the decision attribute at every step. UIDC was applied successfully to unsupervised datasets, with positive results. The dependency formula is applied to the unlabeled datasets to extract features and calculate attribute dependency. Unsupervised datasets from UCI and Kaggle are used, with normalization and preprocessing applied for better performance. Furthermore, parallel computing is applied, reducing execution time by almost 50%. Compared with conventional methods, the proposed approach achieves the highest classification accuracy.
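For readers unfamiliar with rough-set attribute dependency, the following is a minimal sketch of the classical, positive-region-based dependency measure that the abstract contrasts with UIDC. It is not the UIDC algorithm from the thesis. The function names, the toy records, and the idea of letting each attribute stand in as the decision attribute when no labels exist are all assumptions made here for illustration.

```python
from collections import defaultdict


def dependency(rows, cond, dec):
    """Classical rough-set dependency of attribute `dec` on attributes `cond`.

    Returns |POS_cond(dec)| / |U|, the fraction of records whose
    equivalence class under `cond` is consistent on `dec`.
    """
    if not rows:
        return 0.0
    # Group records into equivalence classes by their values on `cond`.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in cond)].append(row[dec])
    # A class lies in the positive region when all its records agree on `dec`.
    positive = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return positive / len(rows)


def mean_unsupervised_dependency(rows, subset, all_attrs):
    """Average dependency of every attribute on `subset`.

    Assumption for illustration: with no decision attribute available,
    each attribute in turn plays the role of the decision attribute.
    """
    return sum(dependency(rows, subset, a) for a in all_attrs) / len(all_attrs)


# Hypothetical toy dataset with three unlabeled attributes.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]

gamma_a_c = dependency(rows, ["a"], "c")        # only the a=1 class is consistent on c
gamma_ab_c = dependency(rows, ["a", "b"], "c")  # every class is a single record
score_ab = mean_unsupervised_dependency(rows, ["a", "b"], ["a", "b", "c"])
```

In this sketch a subset whose mean dependency reaches 1.0 determines every attribute in the toy data, so it would be a candidate reduct. Recomputing the equivalence classes from scratch after each added, deleted, or merged record is the cost that an incremental method such as the one described above aims to avoid.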