Abstract:
Higher education is becoming increasingly important for the development and success of
individuals. Consequently, a large number of students pursue higher education, but some
of them face difficulties that lead to unsatisfactory academic performance; they may even
be at risk of dropping out of their university degrees altogether. It is therefore important
to detect entry-level students who are struggling academically so that
their issues can be remedied, which will, in turn, improve their academic performance.
A real-world dataset of students is consolidated in this study so struggling students
can be identified. This study also presents this classification of students through data-driven
machine learning techniques and identifies students who are struggling early on
in their first two semesters. Multiple machine learning models are trained and tested, and
their comparisons are presented. The reasons why these students are struggling are also
explored using Shapley values, and the best-performing machine
learning models are then recommended for classifying the students who are at risk.