Abstract:
Adverse Drug Reactions (ADRs) are common and can have serious consequences for patients.
Detecting them is a difficult task. With their growing popularity, social media platforms have
become a rich source of data, including posts that can help identify potential ADRs. Extracting
useful information from these posts is challenging, however, because the data is unstructured
and very large in volume. This study proposes an approach to detect and list unknown ADRs from
social media data using machine learning and NLP-based techniques.
The framework uses Natural Language Processing (NLP) to automate the discovery of ADRs
mentioned in social media posts. The discovered ADRs are then compared against a list of known
ADRs to identify unknown ones. The dataset for this study was collected by the authors and
contains tweets related to ADRs. Three drugs were shortlisted for this study: Adderall, Xanax,
and Prozac. One unknown ADR each was found for Adderall and Xanax, and three unknown ADRs were
found for Prozac. Beyond identifying unknown ADRs, the proposed approach could be adapted to
other problems in the future. By providing a new approach to detect unknown ADRs from tweets,
this study aims to improve patient safety and contributes to the field of pharmacovigilance.
Keywords - Adverse Drug Reactions (ADRs), Social Media, Natural Language Processing (NLP), Word Embeddings, Word2Vec Model, Cosine Similarity
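
The comparison step named in the abstract and keywords (word embeddings scored with cosine similarity against a list of known ADRs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the helper name `flag_unknown`, the threshold of 0.8, and the three-dimensional toy vectors, which stand in for embeddings that a trained Word2Vec model would supply.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)

def flag_unknown(candidates, known, embeddings, threshold=0.8):
    """Return candidate terms that are not close to any known ADR term.

    Hypothetical helper: the threshold is an illustrative assumption.
    """
    unknown = []
    for term in candidates:
        best = max(
            (cosine_similarity(embeddings[term], embeddings[k]) for k in known),
            default=0.0,
        )
        if best < threshold:
            unknown.append(term)
    return unknown

# Toy vectors invented for this example; real ones would come from a
# word-embedding model trained on the collected tweets.
toy_embeddings = {
    "nausea": [0.9, 0.1, 0.0],
    "queasy": [0.85, 0.2, 0.05],
    "headache": [0.1, 0.9, 0.1],
    "hair_loss": [0.0, 0.1, 0.95],
}
known_adrs = ["nausea", "headache"]
candidates = ["queasy", "hair_loss"]

result = flag_unknown(candidates, known_adrs, toy_embeddings)
print(result)
```

In this toy setup, a term whose vector lies close to a known ADR is treated as already covered, while a term far from every known ADR is flagged as a candidate unknown ADR for further review.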