Similarity computation using semantic networks created from web-harvested data

Elias Iosif, Alexandros Potamianos

Research output: Contribution to journal › Article › peer-review

Abstract

We investigate language-agnostic algorithms for the construction of unsupervised distributional semantic models using web-harvested corpora. Specifically, a corpus is created from web document snippets, and the relevant semantic similarity statistics are encoded in a semantic network. We propose the notion of semantic neighborhoods that are defined using co-occurrence or context similarity features. Three neighborhood-based similarity metrics are proposed, motivated by the hypotheses of attributional and maximum sense similarity. The proposed metrics are evaluated against human similarity ratings, achieving state-of-the-art results.
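The abstract does not give the formulas behind the neighborhoods or the three metrics, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a Jaccard-style co-occurrence score over snippets, and a toy metric that takes the maximum similarity between one word and the neighbors of the other. All function names, the scoring rule, and the sample snippets are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_similarity(snippets):
    """Score every word pair by snippet co-occurrence (Jaccard; an assumption)."""
    doc_freq = defaultdict(int)   # number of snippets containing each word
    pair_freq = defaultdict(int)  # number of snippets containing both words
    for snippet in snippets:
        words = set(snippet.lower().split())
        for word in words:
            doc_freq[word] += 1
        for a, b in combinations(sorted(words), 2):
            pair_freq[(a, b)] += 1

    # The "semantic network": a weighted, symmetric adjacency map.
    sim = defaultdict(dict)
    for (a, b), n_ab in pair_freq.items():
        score = n_ab / (doc_freq[a] + doc_freq[b] - n_ab)
        sim[a][b] = score
        sim[b][a] = score
    return sim


def neighborhood(sim, word, size):
    """Return the `size` most similar words to `word` (ties broken by name)."""
    ranked = sorted(sim.get(word, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:size]]


def max_neighborhood_similarity(sim, w1, w2, size=3):
    """Toy metric: best similarity between one word and the other's neighbors."""
    def best(target, source):
        scores = (sim[target].get(n, 0.0) for n in neighborhood(sim, source, size))
        return max(scores, default=0.0)

    return max(best(w1, w2), best(w2, w1))


# Hypothetical stand-ins for web document snippets.
snippets = [
    "car automobile engine",
    "car engine wheel",
    "automobile engine wheel",
    "forest tree leaf",
    "forest tree bird",
]
sim = cooccurrence_similarity(snippets)
related = max_neighborhood_similarity(sim, "car", "automobile", size=2)
unrelated = max_neighborhood_similarity(sim, "car", "forest", size=2)
```

In this toy corpus, "car" and "automobile" share the neighbor "engine" and so score higher than "car" and "forest", which have no neighbors in common. A real system would replace the snippet list with harvested web data and the scoring rule with the metrics defined in the article.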

Original language: English
Pages (from-to): 49-79
Number of pages: 31
Journal: Natural Language Engineering
Volume: 21
Issue number: 1
Publication status: Published - 23 Jan 2015
Externally published: Yes
