Abstract
We present an affective text analysis model that can directly estimate and combine affective ratings of multi-word terms, with application to the problem of sentence polarity/semantic orientation detection. Starting from a hierarchical compositional method for generating sentence ratings, we expand the model by adding multi-word terms that can capture non-compositional semantics. The method operates similarly to a bigram language model, using bigram terms or backing off to unigrams based on a (degree of) compositionality criterion. The affective ratings for n-gram terms of different orders are estimated via a corpus-based method using distributional semantic similarity metrics between unseen words and a set of seed words. N-gram ratings are then combined into sentence ratings via simple algebraic formulas. The proposed framework produces state-of-the-art results for word-level tasks in English and German and for the sentence-level news headline classification task of SemEval'07-Task14. The inclusion of bigram terms in the model provides a significant performance improvement, even when no term selection is applied.
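The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the rating formula, the compositionality criterion, or the combination formulas, so everything below is an assumption made for illustration. In particular, `rate_term` uses a plain similarity-weighted average of seed ratings as a stand-in for the corpus-based estimator, `sentence_rating` uses a hypothetical `compositionality` score with a hypothetical `threshold` to decide between a bigram and its unigrams, and the sentence rating is a simple mean of the selected term ratings.

```python
from typing import Callable, Dict, List, Tuple

Bigram = Tuple[str, str]


def rate_term(
    term: str,
    seeds: Dict[str, float],
    similarity: Callable[[str, str], float],
) -> float:
    """Estimate a rating for `term` from rated seed words.

    Assumption: a similarity-weighted average of the seed ratings,
    used here as a simplified stand-in for the paper's estimator.
    """
    weighted = 0.0
    total = 0.0
    for seed, rating in seeds.items():
        s = similarity(term, seed)
        weighted += s * rating
        total += abs(s)
    return weighted / total if total > 0 else 0.0


def sentence_rating(
    tokens: List[str],
    unigram_ratings: Dict[str, float],
    bigram_ratings: Dict[Bigram, float],
    compositionality: Dict[Bigram, float],
    threshold: float = 0.5,
) -> float:
    """Combine term ratings into one sentence rating.

    Assumption: a bigram is used when it has a rating and its
    (hypothetical) compositionality score is below `threshold`;
    otherwise the model backs off to the unigram rating.
    The selected ratings are combined with a plain mean.
    """
    ratings: List[float] = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            pair = (tokens[i], tokens[i + 1])
            score = compositionality.get(pair, 1.0)
            if pair in bigram_ratings and score < threshold:
                ratings.append(bigram_ratings[pair])
                i += 2  # the bigram consumes both tokens
                continue
        # Back off to the unigram; unknown words are rated as neutral.
        ratings.append(unigram_ratings.get(tokens[i], 0.0))
        i += 1
    return sum(ratings) / len(ratings) if ratings else 0.0
```

With toy inputs, a low-compositionality bigram such as ("hot", "dog") is rated as a single unit, while raising or lowering `threshold` moves terms between the bigram and unigram paths. A real system would replace the toy similarity function with a distributional similarity metric computed from a corpus.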
Original language | English |
---|---|
Article number | 6578101 |
Pages (from-to) | 2379-2392 |
Number of pages | 14 |
Journal | IEEE Transactions on Audio, Speech and Language Processing |
Volume | 21 |
Issue number | 11 |
Publication status | Published - 2013 |
Externally published | Yes |
Keywords
- Affect
- affective lexicon
- distributional semantic models
- emotion
- lexical semantics
- natural language understanding
- opinion mining
- polarity detection
- sentiment analysis
- valence