Abstract:
Weblogs and news articles provide a large data set that can be used to
classify information such as an author's point of view, affiliation, or bias.
When reading an article, users need to determine the author's point of view,
that is, whether the author speaks positively or negatively about the topic
under discussion.
News articles these days are full of violence, hate speech, and authorial
bias. There is a need to analyze articles and classify them by whether the
content the author produces is positive or negative, and by whether the author
uses extreme words or hate speech about any specific group. The main idea is
to convey the author's underlying attitude to the user.
In this research, we propose a solution to achieve this objective by
performing sentiment analysis using the Stanford CoreNLP API. We calculated
the sentiment value of each news article using different parameters, and
through experimentation we concluded that weighting the sentiment value of the
text at 60% and that of the headline at 40% produces results closest to human
evaluation. We also propose a new algorithm, Recursive Cosine, for similarity
matching of news articles. Our algorithm uses classical cosine similarity in a
different way.
We compared the latest news articles with those published earlier within a
certain time frame, and the newly constructed algorithm achieved 5% better
accuracy than the classical cosine similarity measure.
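
As a rough illustration of the two quantities mentioned above, the following
minimal Python sketch shows a 60/40 weighted combination of text and headline
sentiment, and the classical cosine similarity that serves as the baseline.
It makes several assumptions: both sentiment scores are taken to be numbers
on the same scale, documents are compared as plain whitespace-tokenized
term-frequency vectors, and the function names are hypothetical. It does not
implement Recursive Cosine, whose details are not given here.

```python
import math
from collections import Counter


def combined_sentiment(text_score, headline_score, text_weight=0.6):
    """Weighted mix of body-text and headline sentiment scores.

    Assumption: both scores are already on the same numeric scale.
    The default weight gives 60% to the text and 40% to the headline.
    """
    return text_weight * text_score + (1.0 - text_weight) * headline_score


def cosine_similarity(doc_a, doc_b):
    """Classical cosine similarity over raw term-frequency vectors.

    Assumption: lowercase whitespace tokenization, with no stemming
    or stop-word removal.
    """
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())
    # Dot product over the terms of the first document; terms missing
    # from the second document contribute zero.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

For instance, `combined_sentiment(3, 1)` weights a text score of 3 and a
headline score of 1 to give approximately 2.2, and `cosine_similarity`
returns 0.0 for two articles that share no terms.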