Abstract:
Feature selection and classification are widely used in machine learning to handle large
volumes of data. In many datasets, the conditional attributes and decision classes are
preference-ordered. Feature selection on such datasets uses an extension of rough set theory
(RST) known as the dominance-based rough set approach (DRSA). DRSA follows the dominance
principle, which states that an object with values at least as high as those of another object
on all conditional attributes must not be assigned to a lower decision class. The
dependency measure of a dataset is used in DRSA to calculate the suitable reducts of a dataset.
The conventional DRSA uses lower and upper approximations to calculate the dependency of the
dataset. The shortcomings of this conventional calculation are its high computational
complexity and heavy use of computational resources. This paper proposes a novel
method, “Incremental Dominance-based Dependency Calculation” (IDDC), to
mitigate these problems. The proposed method uses an incremental approach to find the dependency of
datasets by scanning the data records one by one and comparing each record with every other
record in the dataset. To compare records, IDDC uses a set of proposed dominance-based
dependency classes. To evaluate the proposed approach, IDDC and the conventional approach
are compared on various datasets from the UCI repository. The results show that the
proposed approach outperforms the conventional approach, reducing execution time by 46% and
required runtime memory by 98% on average.
Keywords: Dominance-based Rough Set Approach (DRSA), Incremental Dominance-based
Dependency Calculation (IDDC), Dependency classes, Rough set theory (RST), Lower
approximations, Upper approximations, Reducts, Fast Reduct Generating Algorithm (FRGA),
UCI repository.
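
For orientation, the conventional approximation-based dependency calculation mentioned in the abstract can be sketched as below. This is an illustrative sketch only and is not the proposed IDDC, whose dependency classes are not specified in the abstract. It assumes that every conditional attribute is gain-type (larger is better), that decision classes are integers ordered from worst to best, and that dependency is the share of objects lying outside every boundary region. The names `dominates` and `drsa_dependency` are hypothetical.

```python
from typing import List, Sequence


def dominates(x: Sequence[float], y: Sequence[float]) -> bool:
    """True if x is at least as good as y on every attribute (gain-type assumed)."""
    return all(a >= b for a, b in zip(x, y))


def drsa_dependency(objects: List[Sequence[float]], classes: List[int]) -> float:
    """Share of objects that fall in no boundary region of an upward class union."""
    n = len(objects)
    # dominating[i]: indices of objects that dominate object i (includes i itself)
    dominating = [{j for j in range(n) if dominates(objects[j], objects[i])}
                  for i in range(n)]
    # dominated[i]: indices of objects that object i dominates (includes i itself)
    dominated = [{j for j in range(n) if dominates(objects[i], objects[j])}
                 for i in range(n)]

    boundary = set()
    # Upward unions "class >= t" for every class except the worst one
    for t in sorted(set(classes))[1:]:
        upward = {i for i in range(n) if classes[i] >= t}
        # Lower approximation: everything dominating i is also in the union
        lower = {i for i in range(n) if dominating[i] <= upward}
        # Upper approximation: i dominates at least one member of the union
        upper = {i for i in range(n) if dominated[i] & upward}
        boundary |= upper - lower
    return (n - len(boundary)) / n


# Toy data (assumed for illustration): the last object dominates the second
# one but has a lower class, so both violate the dominance principle.
toy_objects = [(3, 3), (2, 2), (1, 1), (2, 3)]
toy_classes = [3, 2, 1, 1]
print(drsa_dependency(toy_objects, toy_classes))  # 0.5
```

Every object is compared with every other object, and the approximations are rebuilt for each class union. This is the kind of cost in time and memory that the abstract attributes to the conventional approach.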