Abstract:
COVID-19 is an infectious and potentially fatal viral disease, and its rapid
and extensive spread has made it one of the world’s most critical problems.
People across the globe faced an alarming
threat due to limited resources, especially in developing countries.
Prediction models that use multivariate regression to assess the risk of
infection have been designed. Other models make symptom-based predictions,
but with limited and incomplete sets of clinical symptoms. In this thesis,
we propose a machine learning approach that predicts COVID-19 infection and
the severity of the patient's condition. Our model is
trained on 6000 clinical records from Holy Family Hospital Rawalpindi and
AJK Health Department Pakistan, including 3000 patients who tested positive.
Of these 3000 patients, 1365 were in serious condition. The proposed
model uses ten features, including cough, fever, sore throat, shortness
of breath, headache, flu, body ache, loss of taste and smell, and diarrhea. To
measure the performance of the model, the predictive analysis uses the area
under the curve (AUC) and average precision (AP). Shapley additive
explanations (SHAP) were used for descriptive analysis to investigate the
most sensitive features. The random forest model outperformed the other
models (support vector machine, KNN, and logistic regression), achieving
AP = 0.98. Our approach demonstrates high prediction accuracy and can be
implemented as a COVID-19 screening tool as well as a technique
to identify the severity of this disease. The proposed methodology can be
used to prioritize testing and evaluation, and to inform future
investigations.
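
As a point of reference for the AP metric mentioned above, the following is a minimal sketch in Python of how average precision can be computed from ranked predictions, using the common definition AP = sum over thresholds of (R_n - R_{n-1}) * P_n. It is not the implementation used in this thesis; the function name average_precision and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def average_precision(labels, scores):
    """Average precision: sum over thresholds of (R_n - R_{n-1}) * P_n.

    labels: 1 for a positive record, 0 for a negative record.
    scores: predicted probability (or any ranking score) per record.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank records from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    ap = 0.0
    tp = 0              # true positives among the records seen so far
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Records with equal scores share one threshold step.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        precision = tp / j          # j records are predicted positive here
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical toy data (not from the thesis): 1 = positive, 0 = negative.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)
```

On this toy data the sketch gives AP = 29/36, approximately 0.806. A ranking that places every positive record above every negative one gives AP = 1.0, so a reported AP of 0.98 indicates that positives are ranked ahead of negatives almost everywhere.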