Abstract:
The exponential growth of social media platforms has revolutionized the way information is shared
and consumed, leading to the widespread dissemination of both factual and false information. The
rapid spread of misleading or entirely false news poses a significant threat to public discourse and
the integrity of democratic processes. The task of accurately classifying the truthfulness of
statements is complex, particularly due to the nuanced and often ambiguous nature of content
shared on social media. Transformer-based Natural Language Processing (NLP) models, such as
Bidirectional Encoder Representations from Transformers (BERT), have demonstrated
proficiency in contextual understanding and text classification. However, these approaches
often fall short in accuracy, largely because they handle imbalanced datasets poorly and do not
integrate supplementary feature sets. To address these
challenges, this research proposes a novel hybrid model that combines the strengths of BERT with
dependency parsing and integrates a Deep Learning (DL) model designed to process metadata.
This hybrid approach improves the model's ability to analyze and classify the complex and varied
structures within the dataset, which in turn raises overall accuracy. Additionally, this
study explores different network architectures and preprocessing techniques aimed at optimizing
the model's performance. The proposed hybrid model was tested on the LIAR dataset, achieving a
notable 64.6% accuracy, which represents a 13.8% improvement over the previous leading
method, the Fake News Detection Multi-Task Learning (FDML) model. These findings indicate
that incorporating richer linguistic features and metadata into classification models can
significantly improve the effectiveness of fake news detection and categorization on
social media platforms.
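
As an illustration of the fusion idea summarized above, the following is a minimal sketch and not the model evaluated in this study. It assumes that the three feature sources (a BERT sentence embedding, dependency-parse features, and encoded metadata) are already available as fixed-length vectors, and it fuses them by simple concatenation before a single linear layer with a softmax output. The vector sizes, the random stand-in values, the untrained weights, and the function names `fuse` and `classify` are all hypothetical. The six labels are the standard LIAR truthfulness classes.

```python
import math
import random

# The six truthfulness classes used in the LIAR dataset.
LABELS = ["pants-fire", "false", "barely-true", "half-true", "mostly-true", "true"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_vec, dep_vec, meta_vec):
    """Fuse the three feature sources by concatenation (an assumed strategy)."""
    return list(text_vec) + list(dep_vec) + list(meta_vec)


def classify(fused, weights, biases):
    """Apply one linear layer plus softmax to the fused feature vector."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


rng = random.Random(0)

# Hypothetical stand-ins: real inputs would come from BERT, a dependency
# parser, and a metadata encoder. Sizes are arbitrary for illustration.
text_vec = [rng.uniform(-1, 1) for _ in range(8)]
dep_vec = [rng.uniform(-1, 1) for _ in range(4)]
meta_vec = [rng.uniform(-1, 1) for _ in range(3)]

fused = fuse(text_vec, dep_vec, meta_vec)

# Untrained random weights; in practice these would be learned.
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in LABELS]
biases = [0.0] * len(LABELS)

probs = classify(fused, weights, biases)
predicted = LABELS[max(range(len(LABELS)), key=probs.__getitem__)]
print(predicted, [round(p, 3) for p in probs])
```

Concatenation is the simplest way to combine heterogeneous features. A trained system would replace the random vectors with real encoder outputs and learn the classifier weights, possibly with class weighting to counter label imbalance.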