Sentiment Analysis

Introduction

Sentiment analysis is part of the broader field of affective computing, which aims to have computers interpret the emotional state of humans and adapt to their behavior, giving an appropriate response to those emotions.

This algorithm evaluates the sentiment polarity (positive or negative) of a given text. It is called “Shertiments,” from “Sherpa's Sharing Sentiments Algorithm.”

It uses the lexical approach, which analyzes the text locally: word by word, or by bigrams and trigrams at most. At the core of the algorithm are several lexicons, which are essentially “dictionaries of sentiment,” and a rule-based system.
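To make the lexical approach concrete, here is a minimal sketch of a lexicon-plus-rules scorer. It is not the Shertiments implementation: the lexicon entries, their weights, the single negation rule, the "neutral" label, and the names `LEXICON`, `NEGATORS`, `polarity_score`, and `polarity_label` are all assumptions made up for illustration.

```python
import re

# Hypothetical toy lexicon: keys are unigrams, bigrams, or trigrams,
# values are made-up polarity weights (positive > 0, negative < 0).
LEXICON = {
    ("good",): 1.0,
    ("bad",): -1.0,
    ("not", "bad"): 0.5,
    ("waste", "of", "time"): -2.0,
}

# Hypothetical rule: a negator flips the sign of the lexicon entry
# that immediately follows it.
NEGATORS = {"not", "never", "no"}

MAX_N = 3  # look at trigrams at most


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def polarity_score(text):
    """Sum the weights of matched n-grams, preferring the longest match."""
    tokens = tokenize(text)
    score = 0.0
    negate = False
    i = 0
    while i < len(tokens):
        matched = False
        # Try trigram first, then bigram, then single word.
        for n in range(min(MAX_N, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in LEXICON:
                value = LEXICON[gram]
                score += -value if negate else value
                negate = False
                i += n
                matched = True
                break
        if not matched:
            # Remember whether this token is a negator for the next entry.
            negate = tokens[i] in NEGATORS
            i += 1
    return score


def polarity_label(text):
    """Map the numeric score to a label; 'neutral' is an added assumption."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon, `polarity_label("not good")` returns `"negative"` because the negation rule flips the weight of "good", while "not bad" is scored directly from its bigram entry. No training step is involved: all of the behavior comes from the lexicon and the rules.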

We prefer the lexical approach because of its efficiency and speed, and because there is no need to train the algorithm, as it is based on existing linguistic knowledge.

The accuracy of this algorithm, on both English and Spanish texts, has been evaluated with several different quality metrics, and it falls within the range of average inter-rater agreement. That is, the algorithm's “opinion” is nearly as close to that of a human rater as the opinions of two different human raters are to each other.
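The comparison with inter-rater agreement can be illustrated with a small sketch. The text does not say which metrics were used; Cohen's kappa is shown here only as one common agreement measure, and the function name `cohen_kappa` and the label lists are hypothetical placeholders, not evaluation data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always used the same single label
    return (observed - expected) / (1.0 - expected)


# Placeholder labels, invented purely to show how the function is called.
human_1 = ["pos", "pos", "neg", "neg"]
human_2 = ["pos", "neg", "neg", "neg"]
algorithm = ["pos", "pos", "neg", "pos"]

human_vs_human = cohen_kappa(human_1, human_2)
algorithm_vs_human = cohen_kappa(algorithm, human_1)
```

Under this reading, the claim in the paragraph above corresponds to `algorithm_vs_human` being close to `human_vs_human` when both are computed over a real annotated corpus.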

The algorithm has been developed by a team of engineers and a psychologist, using state-of-the-art techniques.

We use lexicons, dictionaries, and rules specific to the Spanish language, rather than translating Spanish texts into English and feeding them to the English-language version of the algorithm, because this lets us take advantage of the idiosyncrasies of each language. As academic experts in sentiment analysis state in [2]: “the best path to long-term improvement is through the inclusion of language-specific knowledge and resources”; and also: “Our evaluation on multi-domain corpora indicates that, although translation and machine learning classification both perform reasonably well, there is a significant cost to automated translation. A language-specific Semantic Orientation Calculator with dictionaries built using words that actually appear in relevant texts gives the best performance, with significant potential for improvement.”

[2] J. Brooke, M. Tofiloski, and M. Taboada. Cross-linguistic sentiment analysis: From English to Spanish. In Proceedings of the International Conference RANLP-2009, pages 50–54, Borovets, Bulgaria, 2009.