Computational Diagnostic Systems for Lung Diseases Through Deep Learning and NLP based Automated Medical Reporting

Sirshar, Mehreen

dc.contributor.author	Sirshar, Mehreen
dc.date.accessioned	2023-07-27T12:20:10Z
dc.date.available	2023-07-27T12:20:10Z
dc.date.issued	2022
dc.identifier.other	NUST201590286PCEME1115S
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/35241
dc.description	Supervisor: Dr. Muhammad Usman Akram; Co-Supervisor: Dr. Shoab Ahmed Khan	en_US
dc.description.abstract	Chest diseases can be fatal, and millions of people are affected by them each year. The recent pandemic has sharply increased the death rate from pneumonic lung failure. Timely and accurate detection of pulmonary diseases can save millions of lives around the globe. Chest X-ray (CXR) is considered the first and foremost diagnostic radiology technique for the initial screening of pulmonary diseases. It is widely available and is used even in the remotest corners of the world because of its low cost, accessibility and simple procedure. However, interpreting CXRs to find anomalies in the thoracic region is tedious and can consume a large amount of a radiologist’s time when there are thousands of images to process. In such scenarios, Computer-Aided Diagnostic (CAD) systems can help radiologists by handling routine processing and presenting the information in a meaningful way, so that the radiologist can make more accurate decisions with less time and effort. Most existing CAD systems are built on publicly available datasets of only a few hundred images and therefore do not generalize at large scale; very few have used large-scale datasets. Apart from the number of images, there is large variability among images from different vendors and geographical regions, which affects the quality of such CAD systems. Another challenge in the domain is generating a radiology report from a single CXR to reduce the radiologist’s burden. With these challenges and limitations in mind, the proposed research presents a framework that includes a custom CNN model for i) improved pulmonary disease identification and classification, ii) cross-disease modal learning and iii) automated report generation by learning physicians’ reports along with CXR images using attention and Long Short-Term Memory (LSTM) modules. 
The classification module proposes a customized version of InceptionResNetV2 to handle the textural changes that different manifestations produce in CXRs. To obtain a generalized model that handles varying lung diseases and a large variety of datasets, we propose an incremental learning module. This module lets the proposed model incrementally learn disease representations over various datasets without forgetting previously learned representations. For this purpose, a new ensemble loss function that includes a mutual distillation loss is introduced to avoid catastrophic forgetting during incremental learning. The last module of the proposed framework deals with radiology report generation from an input CXR image. It uses the proposed backbone to extract vision features and then couples them with an attention head and LSTM layers to generate the final reports. The evaluation uses several publicly available CXR datasets, i.e. the NIH CXR dataset, the Indiana University dataset, JSRT, Shenzhen and Zhang, together with the locally gathered Health Ways dataset. Three levels of experiments are conducted to evaluate the classification module using the NIH and local datasets. In the first level, binary classification separates no-finding samples from abnormal ones, achieving 98.18% and 94.91% accuracy on the NIH and local datasets respectively. The second level further subdivides abnormal cases into pneumonia-related manifestations, with average accuracy of 86.35% and 88% on the NIH and local datasets respectively. The third level is conducted on the NIH dataset for all 14 manifestations, where the model achieves an average accuracy of 82.74% and an average area under the curve of 86%. The incremental learning module is tested on the above-mentioned publicly available datasets, and a comparison with existing incremental learning frameworks shows that the proposed loss function outperforms existing techniques. 
The report generation module is evaluated using the Indiana University dataset, one of the most widely used datasets for CXR report generation. Finally, the module is also evaluated on the locally gathered Health Ways dataset to find localized patterns of pulmonary diseases and to generate reports. The results show that the proposed framework outperforms the existing state of the art in both disease classification and report generation.	en_US
dc.language.iso	en	en_US
dc.publisher	College of Electrical & Mechanical Engineering (CEME), NUST	en_US
dc.title	Computational Diagnostic Systems for Lung Diseases Through Deep Learning and NLP based Automated Medical Reporting	en_US
dc.type	Thesis	en_US
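
The abstract describes a loss with a distillation term that guards against catastrophic forgetting during incremental learning, but the record does not give its form. The following is only a minimal sketch of the general idea, assuming a standard temperature-softened distillation term combined with cross-entropy; the function names, the weight alpha and the temperature are hypothetical and are not taken from the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of the predicted distribution against one true label."""
    return -math.log(softmax(logits)[label])


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(old || new) between temperature-softened outputs.

    Penalizes the updated model for drifting away from what the
    previous model predicted on the previously learned classes.
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0)


def combined_loss(new_logits, label, old_logits, alpha=0.5, temperature=2.0):
    """Weighted sum of cross-entropy and distillation terms.

    alpha and temperature are illustrative assumptions. The first
    len(old_logits) entries of new_logits are assumed to correspond
    to the classes the previous model already knew.
    """
    n_old = len(old_logits)
    ce = cross_entropy(new_logits, label)
    kd = distillation_loss(old_logits, new_logits[:n_old], temperature)
    return (1 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    old = [2.0, 0.5, -1.0]        # previous model: 3 known classes
    new = [1.5, 0.2, -0.5, 3.0]   # updated model: 3 old classes + 1 new
    print(combined_loss(new, label=3, old_logits=old))
```

With alpha set to 0 the sketch reduces to plain cross-entropy on the new data, and raising alpha weights retention of the old model's behaviour more heavily.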