Abstract:
Using the DNA methylation data available in The Cancer Genome Atlas, we propose a new data preprocessing method that uses cancer driver genes to extract the relevant features from the data. After the preprocessing step, we performed feature extraction, selecting the top 50 features from each of four sites of the human body. This feature extraction method yielded an F-score comparable to those reported in other studies while also reducing the overall space complexity of the problem.
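
The two steps summarized above (filtering by driver genes, then keeping the top features per site) can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The abstract does not state how features are ranked, so ranking by variance of the methylation values across samples is an assumption made purely for illustration. The function names, the probe identifiers, the site label, and the toy values are all hypothetical.

```python
from statistics import pvariance


def filter_by_driver_genes(probe_to_gene, driver_genes):
    """Keep only the probes whose annotated gene is in the driver-gene set."""
    return [probe for probe, gene in probe_to_gene.items() if gene in driver_genes]


def top_k_features(beta, probes, k=50):
    """Return the k probes with the highest variance across samples.

    ASSUMPTION: variance is used as the ranking criterion for illustration;
    the source does not specify the criterion.
    beta maps probe -> list of methylation values, one per sample.
    """
    ranked = sorted(probes, key=lambda p: pvariance(beta[p]), reverse=True)
    return ranked[:k]


def select_features_per_site(site_data, probe_to_gene, driver_genes, k=50):
    """Apply the driver-gene filter, then pick the top k probes for each site."""
    kept = filter_by_driver_genes(probe_to_gene, driver_genes)
    return {
        site: top_k_features(beta, [p for p in kept if p in beta], k)
        for site, beta in site_data.items()
    }


# Toy example (hypothetical probe IDs and values, k=2 instead of 50).
probe_to_gene = {"cg01": "TP53", "cg02": "KRAS", "cg03": "GAPDH", "cg04": "TP53"}
driver_genes = {"TP53", "KRAS"}
site_data = {
    "site_a": {
        "cg01": [0.1, 0.9, 0.5],
        "cg02": [0.4, 0.5, 0.45],
        "cg03": [0.0, 1.0, 0.5],  # high variance, but not a driver gene
        "cg04": [0.2, 0.3, 0.8],
    }
}
selected = select_features_per_site(site_data, probe_to_gene, driver_genes, k=2)
print(selected)
```

In this toy run, the probe mapped to a non-driver gene is discarded before ranking even though its variance is the highest, which is the sense in which the driver-gene filter reduces the feature space ahead of the per-site selection.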