NUST Institutional Repository

Hybridizing Multiple Filters and GA Wrapper for Feature Selection of Microarray Cancer Datasets

dc.date.accessioned 2023-08-09T07:16:31Z
dc.date.available 2023-08-09T07:16:31Z
dc.date.issued 2019
dc.identifier.other 00000205406
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/35940
dc.description Supervisor: DR. USMAN QAMAR en_US
dc.description.abstract DNA microarray technology is a valuable advancement in the medical field, but it raises challenges such as the curse of dimensionality and heavy storage and computational requirements. Feature selection is one way to handle these issues. To address the challenges associated with microarray cancer datasets without compromising relevancy or optimality, and to improve the performance of metaheuristic Genetic Algorithm (GA) based wrappers, this work proposes a hybrid feature selection approach based on multiple filters and a GA wrapper (MF-GARF) that uses Random Forest as the fitness evaluator of features. The proposed approach comprises three phases. The first is the Relevancy block, which contains the information-theory-based filters Information Gain, Gain Ratio and Gini Index and is responsible for ensuring relevancy by removing irrelevant and noisy features. The second is the Redundancy block, which uses Pearson correlation statistics to remove redundancy among features. The final phase is the Optimization block, which contains a GA wrapper with Random Forest as the fitness evaluator and is responsible for generating an optimal feature subset with high predictive power. Random Forest, kNN, Naïve Bayes and SVM classifiers within a 10-fold cross-validation setup are used to calculate the classification accuracy of the selected feature subset. Experiments are carried out on 7 publicly available benchmark binary and multiclass microarray gene expression cancer datasets, and the proposed algorithm achieves good accuracy with a minimal number of selected features on all datasets. A thorough comparison with other state-of-the-art GA-based and other metaheuristic hybrid techniques validates the effectiveness of the proposed approach in terms of feature count and classification accuracy. en_US
dc.language.iso en en_US
dc.publisher College of Electrical & Mechanical Engineering (CEME), NUST en_US
dc.subject Key Words: Genetic Algorithm, Microarray Gene Expression Datasets, Feature Selection, Information Gain, Gini Index, Gain Ratio, Correlation, Random Forest, Hybrid, Wrapper, Filter, Microarray Cancer Dataset, Gene Selection en_US
dc.title Hybridizing Multiple Filters and GA Wrapper for Feature Selection of Microarray Cancer Datasets en_US
dc.type Thesis en_US
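
The first two phases described in the abstract (relevance filtering, then redundancy removal) could be sketched roughly as follows. This is only an illustrative sketch, not the thesis implementation: it uses Information Gain alone (the thesis also combines Gain Ratio and Gini Index), assumes the expression values are already discretized, and omits the GA wrapper with Random Forest entirely. The function names, the `top_k` cut-off, the `max_corr` threshold and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Information gain of a discrete feature with respect to the labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_features(columns, labels, top_k, max_corr):
    """Keep the top_k features by information gain (relevancy), then greedily
    drop any feature whose absolute correlation with an already kept feature
    reaches max_corr (redundancy). Both thresholds are assumed values."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )[:top_k]
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_corr for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical): six samples, four discretized "genes".
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "g1": [0, 0, 0, 1, 1, 1],  # perfectly predictive
    "g2": [0, 0, 0, 1, 1, 1],  # duplicate of g1, hence redundant
    "g3": [0, 0, 1, 1, 1, 1],  # partly predictive
    "g4": [0, 1, 0, 1, 0, 1],  # close to noise
}
selected = filter_features(columns, labels, top_k=3, max_corr=0.9)
print(selected)
```

In a full pipeline of the kind the abstract describes, the surviving features would then be passed to the GA wrapper, where each candidate subset is scored by a Random Forest classifier.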


This item appears in the following Collection(s)

  • MS [441]
