dc.description.abstract |
Heart disease is a very serious medical condition. According to the World Health Organization, about 17.9 million deaths were caused by cardiovascular diseases (CVD) in 2019.
Coronary heart disease (CHD) and stroke are forms of CVD. To avoid complications
and cardiac arrests, timely diagnosis of CVD is vital. However, many factors such as
lifestyle, activity level, diabetes, smoking, cholesterol level and even family history have to be taken into account when predicting CHD. Hence the diagnosis is not
only difficult but also very expensive. To address such problems, machine learning (ML)
methods are being used. However, one of the major challenges is the limited amount of data, combined with
significant class imbalance. This study proposes an efficient three-step
solution using feature weight assessment, data sampling and ML models. The National
Health and Nutrition Examination Survey (NHANES) data, which is highly imbalanced such that
the ratio of CHD to Non-CHD cases is almost 1:28, is used. Despite the remarkable
data imbalance, the architecture that uses Elastic Net for feature weight assessment,
followed by SMOTETomek sampling and sub-sampling methods and a shallow convolutional neural network (CNN), provides balanced results for both classes. It achieves
an accuracy of 94%, with precision and recall of 0.91, 0.97 and 0.96, 0.90 for the CHD and
Non-CHD classes, respectively. |
en_US |