Abstract:
Sentiment classification is concerned with automated techniques that predict the polarity
of text. It is an important sub-area of opinion mining and text mining, with applications
in areas including customer recommendation and feedback analysis, business intelligence,
information retrieval, and social well-being services.
SentiWordNet is the English-language lexical resource with the largest number of lexical
entries; each synset (set of synonyms) is labeled with subjective (positive and negative)
and objective numerical scores that carry sentiment information. It is specifically designed
to support opinion mining tasks. With such a readily available resource, more effective
sentiment analysis methods can be built on this sentiment-bearing information.
This research uses SentiWordNet to address the automatic sentiment classification problem
on the multi-domain sentiment dataset of product reviews and the polarity dataset of movie
reviews. First, sentiment features were collected from the subjective terms of SentiWordNet
and used in machine-learning-based sentiment classification. Because the set of subjective
terms in SentiWordNet is limited, texts with few or no sentiment features can yield
ambiguous or null sentiment. We therefore propose a new dimension of content-specific
features, namely syntactic noun and verb phrases along with unigram features, to reinforce
the performance of the sentiment-feature-based classifier on the underlying reviews.
Different feature combinations were evaluated to find the most representative features,
together with F-score-based feature selection to reduce dimensionality.
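
As an illustration of the feature selection step, the following is a minimal sketch of
F-score ranking for a two-class problem. It assumes the Fisher-style F-score (squared
distance of each class mean from the overall mean, divided by the sum of the within-class
sample variances); the abstract does not state which variant was used. The function names
f_score and select_top_k and the toy feature matrix are hypothetical.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """Fisher-style F-score of one feature (assumed variant, see lead-in).

    pos_values / neg_values: the feature's values in positive and negative
    documents; each list needs at least two entries.
    """
    overall = mean(pos_values + neg_values)
    # Between-class term: how far each class mean lies from the overall mean.
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    # Within-class term: sum of the sample variances (n - 1 denominator).
    within = variance(pos_values) + variance(neg_values)
    # Simplification for this sketch: a feature with zero within-class
    # variance is scored 0.0 instead of raising a division error.
    return between / within if within > 0 else 0.0


def select_top_k(pos_docs, neg_docs, feature_names, k):
    """Rank features by F-score and keep the k highest-scoring ones."""
    scores = {}
    for j, name in enumerate(feature_names):
        pos_col = [row[j] for row in pos_docs]
        neg_col = [row[j] for row in neg_docs]
        scores[name] = f_score(pos_col, neg_col)
    ranked = sorted(feature_names, key=lambda n: scores[n], reverse=True)
    return ranked[:k], scores


# Toy term-count vectors (hypothetical data): columns are "good", "the", "bad".
feature_names = ["good", "the", "bad"]
pos_docs = [[2, 3, 0], [3, 3, 1], [2, 4, 0]]
neg_docs = [[0, 3, 2], [1, 4, 3], [0, 3, 2]]
selected, scores = select_top_k(pos_docs, neg_docs, feature_names, k=2)
```

In this toy example the class-neutral term "the" receives a score of zero and is dropped,
while the two polar terms are retained; the reduced feature set would then be passed to
the classifier.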
The results are compared with other documented methods from the literature. The comparison
shows that sentiment features combined with content-specific features outperform similar
approaches applied to the same review datasets. This indicates that content-specific verb
and noun phrase features could become a new dimension for sentiment classification.