Unsupervised combination of metrics for semantic class induction

Elias Iosif, Athanasios Tegos, Apostolos Pangos, Eric Fosler-Lussier, Alexandros Potamianos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, unsupervised algorithms for combining semantic similarity metrics are proposed for the problem of automatic class induction. The automatic class induction algorithm is based on the work of Pargellis et al. [1]. The semantic similarity metrics that are evaluated and combined are based on narrow- and wide-context vector-product similarity. The metrics are combined using linear weights that are computed 'on the fly' and are updated at each iteration of the class induction algorithm, forming a corpus-independent metric. Specifically, the weight of each metric is selected to be inversely proportional to the inter-class similarity of the classes induced by that metric at the current iteration of the algorithm. The proposed algorithms are evaluated on two corpora: a semantically heterogeneous news domain (HR-Net) and an application-specific travel reservation corpus (ATIS). It is shown that the (unsupervised) adaptive weighting scheme outperforms the (supervised) fixed weighting scheme. Up to 50% relative error reduction is achieved by the adaptive weighting scheme.
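
The weighting rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalization of the weights to sum to 1, and the small `eps` guard against division by zero are assumptions made for illustration, and the abstract does not say how the inter-class similarity itself is measured, so it is taken here as a given number per metric.

```python
def adaptive_weights(inter_class_similarity, eps=1e-12):
    """Return one weight per metric, inversely proportional to the
    inter-class similarity of the classes that metric induced.

    Assumption: weights are normalized to sum to 1.
    Assumption: eps guards against a zero similarity value.
    """
    inverse = [1.0 / max(s, eps) for s in inter_class_similarity]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine_scores(scores, weights):
    """Linear combination of per-metric similarity scores for one word pair."""
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical values for one iteration with two metrics
# (e.g. a narrow-context and a wide-context metric).
weights = adaptive_weights([0.2, 0.6])
combined = combine_scores([0.8, 0.4], weights)
```

In this sketch the metric whose induced classes are less similar to one another (0.2 versus 0.6) receives the larger weight, and the weights would be recomputed at every iteration of the class induction algorithm.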

Original language: English
Title of host publication: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings
Pages: 86-89
Number of pages: 4
Publication status: Published - 2006
Externally published: Yes
Event: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006 - Palm Beach, Aruba
Duration: 10 Dec 2006 to 13 Dec 2006

Publication series

Name: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings

Conference

Conference: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006
Country/Territory: Aruba
City: Palm Beach
Period: 10/12/06 to 13/12/06

Keywords

  • Information retrieval
  • Ontology creation
  • Text processing
