Abstract:
The Covid-19 pandemic affected billions of people’s lives, both personally and socially. Its impact on the public’s psychological health was significant because it changed the ways in which people lived, worked, and socialized. The pandemic became a major topic of discussion on social media platforms, where people expressed their views and described its detrimental effects on their psychological health. Covid-19 is a new infectious disease; because it is airborne and pharmaceutical measures were lacking, social distancing was adopted to control its rapid spread. Social media is now considered a main information hub because information is shared there on a large scale, and people share their emotions and views on specific topics through their discussions. This research analyzes the views and thoughts that people shared on Twitter about the Covid-19 pandemic and its detrimental or non-detrimental effects on the public’s mental health, using machine learning algorithms. Sentiment analysis is a conventional method for exploring people’s views by mining the text that online users generate. The primary objective of the research is to analyze people’s views on the Covid-19 pandemic by classifying Tweets collected from Twitter. The accuracy of the classification method is enhanced by using a word embedding approach. Deep learning embedding models such as BERT and its variants are employed to generate high-dimensional word vectors that preserve the semantic information of words. These word vectors are then used to train the model to classify each tweet into one of five sentiment classes: Positive, Extremely Positive, Negative, Extremely Negative, and Neutral. The methodology is tested on a publicly available Tweets dataset from Kaggle, which was split in a 90:10 ratio into training and testing sets, respectively.
Among all the models evaluated, the BERT and MiniLM uncased classification models achieved the highest accuracies on the Kaggle dataset, 88% and 93% respectively. This analysis can assist health authorities in monitoring health information and in planning and conducting interventions to lower the pandemic’s effects, and it can help governments take precautionary measures.
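
As a minimal illustration of the evaluation setup described above, the following Python sketch shows one way a labeled tweet collection could be divided 90:10 into training and testing sets across the five sentiment classes. The function name `split_90_10`, the fixed seed, and the toy data are assumptions made for illustration only; the abstract does not state how the split was shuffled or seeded, and this is not the authors' code.

```python
import random

# The five sentiment classes named in the abstract.
LABELS = [
    "Extremely Negative",
    "Negative",
    "Neutral",
    "Positive",
    "Extremely Positive",
]


def split_90_10(examples, seed=0):
    """Shuffle a copy of `examples` and split it 90:10 into (train, test).

    Hypothetical helper: the seed and the shuffling strategy are assumptions.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in data: (tweet_text, label) pairs, not the Kaggle dataset.
toy = [(f"tweet {i}", LABELS[i % len(LABELS)]) for i in range(100)]
train, test = split_90_10(toy)
```

A fixed seed keeps the split reproducible between runs; a stratified split would additionally keep the class proportions equal in both sets, which may matter if the five classes are imbalanced.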