Abstract:
realizes his or her own potential, can cope with the normal stresses of life, can work
productively and fruitfully, and is able to make a contribution to her or his community”.
Mental disorders affect not only mental attributes such as the management of
emotions, the ability to concentrate, and interaction with others, but also the physical
health of individuals. The factors that WHO lists as determinants of mental disorders
are genetics, stress, prenatal infections, exposure to environmental hazards, standards
of living, working conditions, and community support.
Health systems are not yet able to address the burden of mental disorders. Most
people with disorders live their whole lives without a diagnosis or correct treatment,
which leads to an increasing rate of suicide. More than 90% of people who commit suicide
have a pre-existing diagnosis of depression. To address this issue, many researchers have
worked on the automatic detection of mental disorders to help practitioners diagnose and
carry out correct treatment. Such systems are meant to assist mental health professionals,
not to replace them. Recently, social media has become a widely used network that connects people
around the world. As people share their life events and thoughts through posts and
status updates, this content accumulates into a large data resource. This resource is
useful for research and analysis, including big data and machine learning studies.
In this study, we analyzed six mental health issues using Reddit data. The data
cover the classes Depression, Anxiety, Bipolar, Bipolar Disorder, Schizophrenia,
and Autism, along with Mental Health, a general class for posts that discuss mental
health broadly. The data include the title and text of each user's Reddit post.
Experiments are carried out using various deep learning and NLP techniques for
classification. The first phase of experimentation comprised data preprocessing and
feature extraction using GloVe embeddings. The second phase applied deep learning
techniques such as the Convolutional Neural Network (CNN), Long Short-Term Memory
(LSTM) network, Gated Recurrent Unit (GRU), bidirectional LSTM (Bi-LSTM), and
bidirectional GRU (Bi-GRU).
In addition to these traditional techniques, the pre-trained BERT and RoBERTa
models have been applied. Finally, a hybrid framework is presented that combines
hierarchical classification with pre-trained RoBERTa fine-tuned on the respective
mental health data.
The last phase compares the results of the baseline deep learning models with the presented
framework. The results show that hierarchical classification with a two-level hierarchy
achieves an average accuracy of 84% on test data, compared with the pre-trained
RoBERTa model, which achieves 82% accuracy on test data.
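
The two-level hierarchical classification described above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the abstract does not state how the classes are grouped, so the "mood" group is hypothetical, and the keyword-based functions are toy stand-ins for the fine-tuned RoBERTa classifiers that would be used at each level.

```python
from typing import Callable, Dict, List

# A classifier is any callable that maps a post's text to a label.
# In practice each one would wrap a fine-tuned model; here they are toy stand-ins.
Classifier = Callable[[str], str]


def hierarchical_predict(text: str, coarse: Classifier,
                         fine: Dict[str, Classifier]) -> str:
    """Two-level prediction: choose a coarse group, then a class within it."""
    group = coarse(text)
    # A group with no second-level classifier is treated as a final label.
    if group not in fine:
        return group
    return fine[group](text)


def accuracy(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical grouping and keyword rules, for illustration only.
def toy_coarse(text: str) -> str:
    return "mood" if "mood" in text.lower() else "Mental Health"


def toy_mood(text: str) -> str:
    return "Bipolar" if "manic" in text.lower() else "Depression"


posts = ["My mood swings include manic episodes", "Low mood for weeks", "General question"]
labels = ["Bipolar", "Depression", "Mental Health"]
predictions = [hierarchical_predict(p, toy_coarse, {"mood": toy_mood}) for p in posts]
print(accuracy(predictions, labels))
```

One consequence of this design is that a first-level error cannot be corrected at the second level, so the accuracy of the top-level classifier bounds the accuracy of the whole hierarchy.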