Abstract:
In the era of Artificial Intelligence (AI) and Intelligent Computing, a revolution in the
healthcare sector is underway. Studies have focused on the development of decision-support
processes for healthcare professionals to enhance disease screening and diagnostics. This
study proposes a decision support system, built on primary data, for diagnosing celiac
disease (CD), a complex autoimmune disorder affecting millions worldwide. CD diagnosis is
complicated by socioeconomic factors, healthcare disparities, and limited access to advanced
facilities and diagnostic technologies. Conventional methods are cost-prohibitive, and a lack
of awareness contributes to underdiagnosis or misdiagnosis in developing countries.
The study focused on improving CD detection rates using AI-based approaches. It aimed to
ensure that no cases of CD go undetected and to minimize the risk of
misdiagnosing celiac cases as non-celiac. The experimentation phase employed five automated
classifiers available in Google Colab Notebook: decision tree, Bayesian classifier, XGBoost,
support vector machine, and logistic regression. The evaluation metrics
were accuracy, sensitivity, specificity, and the area under the ROC curve
(AUC). These models were selected for their proven ability to handle both continuous and
categorical data, including categorical dependent variables, within a classification task.
Additionally, in view of the limitations of previous applications of AI-based diagnostic
methods, comprehensive data preprocessing and feature engineering techniques were
introduced, including Recursive Feature Elimination (RFE). Among the
models examined, the XGBoost classifier achieved the highest accuracy (97.0%), with
a sensitivity of 0.98, a false-negative ratio of 1, and an AUC of 0.91. The outcomes of the
study represent a step towards the development of a smart clinical decision
support system (CDSS). Future directions include market validation of the proposed process
and its transformation into a smart application for ease of adoption by end users. The study's
methodology, which encompasses primary data collection, robust preprocessing, and
meticulous feature engineering, not only enhances predictive accuracy but also establishes a
pioneering CD data repository. This repository, which contains comprehensive patient
information, is poised to support future CD research. In essence, the study's detailed results
underscore the transformative potential of AI-driven diagnostic approaches in tackling the
complexities of celiac disease.
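
For readers less familiar with the evaluation measures named above (accuracy, sensitivity,
specificity, false negatives, and AUC), the following is a minimal sketch of how they can be
computed for a binary celiac / non-celiac classifier. It is illustrative only and is not the
study's own code: the function names, the 0.5 decision threshold, and the eight toy labels
and scores are assumptions invented for the example, and the false-negative figure is shown
as a raw count because the abstract does not define its "false-negative ratio".

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP, FN for binary labels (1 = celiac, 0 = non-celiac)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1  # celiac case predicted as non-celiac
    return counts


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and the raw false-negative count.

    Assumes both classes are present, so no denominator is zero.
    """
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),  # share of celiac cases caught
        "specificity": c["tn"] / (c["tn"] + c["fp"]),  # share of non-celiac cases cleared
        "false_negatives": c["fn"],
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity is the measure most closely tied to the stated aim of leaving no CD case
undetected, since every false negative lowers it directly; AUC, by contrast, does not depend
on the chosen threshold.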