Abstract:
Chest diseases can be fatal, and millions of people are infected by them each year.
The recent pandemic has exponentially increased the death rate due to pneumonic lung failure. Timely and accurate detection of pulmonary diseases can save millions of lives
around the globe. For the initial screening of pulmonary diseases, the chest X-ray (CXR)
is considered the first and foremost diagnostic radiology technique. Moreover,
it is widely available and has been adopted in even the remotest corners of the world owing to its
economic feasibility, accessibility, and simple procedure. On the other hand, interpreting CXRs to find anomalies in the thoracic region is a tedious job and can consume an
ample amount of a radiologist's time when there are thousands of images to process. In
such scenarios, Computer-Aided Diagnostic (CAD) systems can assist radiologists
by performing the routine processing and presenting the information in a meaningful way,
so that radiologists can make more accurate decisions while spending less
time and energy. Most of the existing CAD systems are based on publicly available
datasets containing only a few hundred images and thus cannot be generalized at large scale,
while very few of them have utilized large-scale datasets. Apart from the number
of images, there is large variability among images coming from different vendors
and geographical regions, which affects the quality of such CAD systems. Another
challenge in the domain is radiology report generation from a single CXR to reduce
the radiologist's burden. Keeping these challenges and limitations in mind, the proposed research work presents a framework that includes a custom CNN model for
(i) improved pulmonary disease identification and classification, (ii) cross-disease modal
learning, and (iii) automated report generation by learning from physicians' reports along
with CXR images using attention and Long Short-Term Memory (LSTM) modules.
The classification module proposes a customized version of InceptionResNetV2 to
handle the textural variations of CXRs across different manifestations.
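A minimal sketch of such a backbone is given below, assuming a Keras-style implementation; the pooling choice, dropout rate, and head sizes are illustrative assumptions rather than the exact thesis architecture.

```python
# Illustrative sketch only: a CXR classifier built on an InceptionResNetV2 backbone
# with a custom classification head. Layer sizes and the dropout rate are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionResNetV2

def build_cxr_classifier(num_classes: int, input_shape=(299, 299, 3)) -> tf.keras.Model:
    # Pretrained backbone; include_top=False drops the ImageNet classifier.
    backbone = InceptionResNetV2(include_top=False, weights="imagenet",
                                 input_shape=input_shape)
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dropout(0.3)(x)                                    # assumed rate
    x = layers.Dense(512, activation="relu")(x)                   # assumed head size
    outputs = layers.Dense(num_classes, activation="sigmoid")(x)  # multi-label output
    return models.Model(inputs=backbone.input, outputs=outputs)

model = build_cxr_classifier(num_classes=14)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```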
To obtain a generalized model that can handle varying lung diseases and a large variety of datasets, we
propose an incremental learning module. This module enables the proposed model to incrementally learn disease representations over various datasets without forgetting
the previously learned representations. For this purpose, a new ensemble loss function including a mutual distillation loss is introduced to avoid catastrophic forgetting
during incremental learning.
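A minimal sketch of a distillation-style loss of this kind is shown below; it combines a standard classification loss with a term that keeps the updated model close to the frozen previous model. The exact formulation of the proposed ensemble and mutual distillation loss is not reproduced here, and the temperature and weighting are illustrative assumptions.

```python
# Illustrative sketch: classification loss plus a distillation term that penalizes
# divergence from the previous (frozen) model, discouraging catastrophic forgetting.
# This is a generic pattern, not the thesis's exact mutual distillation loss.
import tensorflow as tf

def incremental_loss(y_true, new_logits, old_logits, temperature=2.0, alpha=0.5):
    # Multi-label classification loss on the current dataset's labels.
    ce = tf.keras.losses.binary_crossentropy(y_true, new_logits, from_logits=True)
    # Distillation term: compare softened probabilities of the old and new models
    # so that previously learned disease responses are preserved.
    old_prob = tf.sigmoid(old_logits / temperature)
    new_prob = tf.sigmoid(new_logits / temperature)
    kd = tf.keras.losses.binary_crossentropy(old_prob, new_prob)
    return alpha * tf.reduce_mean(ce) + (1.0 - alpha) * tf.reduce_mean(kd)
```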
The last module of the proposed framework deals with
radiology report generation from an input CXR image. It uses the proposed backbone to
extract visual features and then couples them with an attention head and LSTM layers to
generate the final report.
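The sketch below illustrates one way such a head can be wired, assuming flattened backbone feature maps as input; the vocabulary size, sequence length, and layer dimensions are assumptions chosen only to show the coupling of attention and LSTM layers.

```python
# Illustrative sketch: an LSTM decoder that attends over CXR region features.
# Dimensions (vocabulary, sequence length, feature shape) are assumed values.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_report_decoder(vocab_size=5000, max_len=60, feat_shape=(64, 1536),
                         embed_dim=256, lstm_units=512) -> tf.keras.Model:
    # Region features from the CXR backbone, flattened to (regions, channels).
    visual_feats = layers.Input(shape=feat_shape, name="visual_features")
    tokens = layers.Input(shape=(max_len,), name="report_tokens")

    # Embed the partially generated report and run it through an LSTM.
    x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(tokens)
    x = layers.LSTM(lstm_units, return_sequences=True)(x)

    # Additive attention: each decoding step attends over the image regions.
    v = layers.Dense(lstm_units)(visual_feats)
    context = layers.AdditiveAttention()([x, v])

    # Predict the next word from the LSTM state and the attended visual context.
    merged = layers.Concatenate()([x, context])
    outputs = layers.Dense(vocab_size, activation="softmax")(merged)
    return models.Model(inputs=[visual_feats, tokens], outputs=outputs)
```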
The evaluation is carried out on different publicly available CXR
datasets, i.e., the NIH CXR dataset, the Indiana University dataset, JSRT, Shenzhen,
Zhang, and the locally gathered Health Ways dataset. Three levels of experiments
are conducted to evaluate the classification module using the NIH and local datasets.
In the first stage, binary classification is performed to separate no-finding samples from abnormal ones, achieving 98.18% and 94.91% accuracy on the NIH and local datasets,
respectively. The second level of experiments further subdivides abnormal cases into
pneumonia-related manifestations, with average accuracies of 86.35% and 88% on the NIH
and local datasets, respectively. The third level of experiments is conducted on the NIH
dataset for all 14 manifestations, where the model achieves an average accuracy of 82.74%
and an average area under the curve of 86%.
The incremental learning-based module is tested on the above-mentioned publicly available datasets, and a comparison with
existing incremental learning frameworks shows that the proposed loss function outperforms existing techniques. The report generation module is evaluated using the Indiana
University dataset, which is one of the most widely used datasets for CXR report generation. Finally, the module is also evaluated on the locally gathered Health Ways
dataset to find localized patterns of pulmonary diseases and generate reports.
The results show that the proposed framework outperforms the existing state of the art
in terms of both disease classification and report generation.