NUST Institutional Repository

Modeling Significant Characteristics of Complete Blood Count Reports for Screening of Leukemia using Machine Learning Methods


dc.contributor.author Qureshi, Hira
dc.date.accessioned 2021-11-17T08:46:34Z
dc.date.available 2021-11-17T08:46:34Z
dc.date.issued 2021-09-06
dc.identifier.other RCMS003293
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/27450
dc.description.abstract Leukemia is an abnormal clonal proliferation of hematopoietic stem cells that affects the bone marrow and lymphatic system. Despite the availability of diagnostic tests, the mortality rate of leukemia is increasing, especially in developing countries with insufficient healthcare facilities. One possible reason is late diagnosis or misdiagnosis, largely due to the painful sample collection procedure and expensive diagnostic tests. There is therefore a need to improve the efficiency of early screening through inexpensive tests such as the Complete Blood Count (CBC) test. This can be achieved by supplementing the usual subjective assessment of medical practitioners with objective, data-driven models. For this purpose, a secondary data set of 287 CBC reports has been used, with 210 disease/leukemic and 67 control/non-leukemic cases. For classification, various combinations of features have been modeled using machine learning methods such as Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF). These combinations include biologically as well as statistically significant features. The developed models are assessed using stratified 10-fold cross-validation with measures such as precision, accuracy, recall, F1 score and specificity. The study concludes that the RF method with 12 features is adequate to predict the state of the subject. These features are Haemoglobin, Haematocrit, Red Blood Cell Count, Monocyte Percent, Platelet Count, Neutrophil Percent, Monocyte Count, Eosinophil Percent, White Blood Cell Count, Lymphocyte Percent, Mean Corpuscular Volume and Lymphocyte Count. The proposed process can therefore help medical practitioners or pathologists screen leukemic patients using numerical estimates of CBC features. en_US
dc.description.sponsorship Dr. Zamir Hussain en_US
dc.language.iso en_US en_US
dc.publisher RCMS NUST en_US
dc.subject Blood Count, Screening of Leukemia, Machine Learning Methods en_US
dc.title Modeling Significant Characteristics of Complete Blood Count Reports for Screening of Leukemia using Machine Learning Methods en_US
dc.type Thesis en_US
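
The evaluation scheme named in the abstract (stratified 10-fold cross-validation scored with precision, accuracy, recall, F1 score and specificity) can be illustrated with a minimal sketch. This is not the thesis's code: the function names `stratified_folds` and `screening_metrics`, the round-robin fold assignment, and the confusion counts in the usage lines are assumptions made for illustration only. The label vector reuses only the class sizes stated in the abstract (210 leukemic, 67 control) and contains no actual CBC data.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds so each fold keeps roughly the
    same class proportions as the full label list (hypothetical helper)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def screening_metrics(tp, fp, tn, fn):
    """Compute the five measures named in the abstract from confusion
    counts, treating the leukemic class as positive (an assumption)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Labels built from the class sizes given in the abstract: 1 = leukemic, 0 = control.
labels = [1] * 210 + [0] * 67
folds = stratified_folds(labels, k=10)

# Hypothetical confusion counts for a single fold, for illustration only.
metrics = screening_metrics(tp=8, fp=2, tn=6, fn=4)
```

In each round, one fold would serve as the test set and the remaining nine as the training set, with the measures averaged over the ten rounds. Specificity is reported alongside recall because, with far fewer control cases than leukemic ones, accuracy alone can hide poor performance on the control class.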


This item appears in the following Collection(s)

  • MS [159]
