Development of Content-Based News Classification, Analysis and Recommendation System

Mushtaq, Saqib

URI: http://10.250.8.41:8080/xmlui/handle/123456789/8261

Date: 2015

Abstract:

Weblogs and news articles provide a large data set that can be used to classify information such as an author's point of view, affiliation, or bias. When users read an article, they need to know the author's point of view, that is, whether the author is speaking positively or negatively about the topic under discussion. News articles these days are full of violence, hate speech, and biased views. There is a need to analyze articles and classify them according to whether their authors produce positive or negative content, and whether an author uses extreme words or hate speech against any specific group. The main idea is to give the reader a clear picture of the author's stance. In this research we propose a solution that achieves this objective by performing sentiment analysis with the Stanford CoreNLP API. We calculated the sentiment value of each news article using different parameters, and through experimentation we concluded that weighting the sentiment value of the text at 60% and that of the headline at 40% produces results nearest to human evaluation. We also propose a new algorithm, Recursive Cosine, for similarity matching of news. Our algorithm uses classical cosine similarity in a different way. We compared the latest news articles with those published in the past within a certain time frame, and the newly constructed algorithm achieved 5% better accuracy than the classical cosine measure for similarity.
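The two quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes the 60%/40% split is a simple weighted sum of two numeric sentiment scores, and it shows only the classical cosine measure over raw term-frequency vectors. The abstract does not describe Recursive Cosine itself, so it is not reproduced here. The function names `combined_sentiment` and `cosine_similarity`, the whitespace tokenization, and the score scale are illustrative assumptions, not taken from the thesis.

```python
import math
from collections import Counter


def combined_sentiment(text_score, headline_score, text_weight=0.6):
    """Blend body-text and headline sentiment into one value.

    Assumption: the 60%/40% split from the abstract is a plain weighted sum.
    """
    return text_weight * text_score + (1.0 - text_weight) * headline_score


def cosine_similarity(doc_a, doc_b):
    """Classical cosine similarity over raw term-frequency vectors.

    Assumption: lowercase whitespace tokenization; the thesis may use a
    different preprocessing pipeline.
    """
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())

    # Dot product over the terms the two documents share.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())

    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))

    # An empty document has no direction, so treat its similarity as zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, `combined_sentiment(text_score, headline_score)` weights the article body more heavily than the headline, and `cosine_similarity(new_article, old_article)` returns a value between 0 and 1 that could be compared against a threshold when matching a new article to earlier ones.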