Disease Classification Based on Clinical Notes

Ashraf, Sundas

dc.contributor.author	Ashraf, Sundas
dc.date.accessioned	2023-07-19T09:54:05Z
dc.date.available	2023-07-19T09:54:05Z
dc.date.issued	2023
dc.identifier.other	364512
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/34829
dc.description	Supervisor: Dr. Usman Qamar	en_US
dc.description.abstract	Effective disease classification plays a crucial role in healthcare for accurate diagnosis, treatment, planning, and patient management. With the increasing adoption of electronic health records (EHRs), a vast amount of clinical data is available that can potentially be leveraged for disease classification. Because the volume of clinical data is growing rapidly, healthcare providers face a significant challenge in extracting meaningful insights from these records. Natural language processing (NLP) techniques can help identify and extract important clinical information from EHRs and support healthcare practitioners in making accurate diagnoses. The study utilizes a large dataset of EHRs from a diverse patient population, encompassing a wide range of diseases and medical conditions. NLP techniques are employed to extract and preprocess clinical notes, removing unnecessary patient information while retaining the essential clinical details. Feature engineering is applied to transform the unstructured clinical text into a structured representation suitable for machine learning algorithms. A variety of machine learning models, including Support Vector Machines (SVM), Passive Aggressive Classifier, Naïve Bayes, and Logistic Regression, are trained and evaluated on the dataset. Performance metrics such as accuracy, precision, recall, and F1 score are used to assess how effectively the classification models predict the presence or absence of specific diseases based on the clinical notes. The results show that the proposed disease classification system achieves an accuracy of 98% across multiple diseases using the SVM classifier, demonstrating the effectiveness of machine learning models in classifying diseases from clinical notes.
The system's accuracy and performance highlight its potential for enhancing healthcare delivery and decision-making, contributing to improved patient care and outcomes. Keywords: Natural Language Processing, Text Classification, Feature Engineering, Support Vector Machine	en_US
dc.language.iso	en	en_US
dc.publisher	College of Electrical and Mechanical Engineering (CEME), NUST	en_US
dc.title	Disease Classification Based on Clinical Notes	en_US
dc.type	Thesis	en_US
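
The abstract lists accuracy, precision, recall, and F1 score as the evaluation metrics. The following is a minimal sketch of how these could be computed for a single disease label treated as the positive class. The function name `classification_metrics` and the sample labels are hypothetical illustrations; they are not taken from the thesis and do not reproduce its data or results.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall, and F1 for one positive label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells for the chosen positive label.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1
        elif pred == positive:
            counts["fp"] += 1
        elif truth == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only; not data from the thesis.
y_true = ["diabetes", "asthma", "diabetes", "asthma", "diabetes", "asthma"]
y_pred = ["diabetes", "asthma", "asthma", "asthma", "diabetes", "diabetes"]
metrics = classification_metrics(y_true, y_pred, positive="diabetes")
```

With several diseases, the same function could be called once per label and the per-label scores averaged; whether the thesis used such averaging is not stated in this record.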