TY - GEN
T1 - Unsupervised combination of metrics for semantic class induction
AU - Iosif, Elias
AU - Tegos, Athanasios
AU - Pangos, Apostolos
AU - Fosler-Lussier, Eric
AU - Potamianos, Alexandros
PY - 2006
Y1 - 2006
N2 - In this paper, unsupervised algorithms for combining semantic similarity metrics are proposed for the problem of automatic class induction. The automatic class induction algorithm is based on the work of Pargellis et al. [1]. The semantic similarity metrics that are evaluated and combined are based on narrow- and wide-context vector-product similarity. The metrics are combined using linear weights that are computed 'on the fly' and are updated at each iteration of the class induction algorithm, forming a corpus-independent metric. Specifically, the weight of each metric is selected to be inversely proportional to the inter-class similarity of the classes induced by that metric at the current iteration of the algorithm. The proposed algorithms are evaluated on two corpora: a semantically heterogeneous news domain (HR-Net) and an application-specific travel reservation corpus (ATIS). It is shown that the (unsupervised) adaptive weighting scheme outperforms the (supervised) fixed weighting scheme. Up to 50% relative error reduction is achieved by the adaptive weighting scheme.
AB - In this paper, unsupervised algorithms for combining semantic similarity metrics are proposed for the problem of automatic class induction. The automatic class induction algorithm is based on the work of Pargellis et al. [1]. The semantic similarity metrics that are evaluated and combined are based on narrow- and wide-context vector-product similarity. The metrics are combined using linear weights that are computed 'on the fly' and are updated at each iteration of the class induction algorithm, forming a corpus-independent metric. Specifically, the weight of each metric is selected to be inversely proportional to the inter-class similarity of the classes induced by that metric at the current iteration of the algorithm. The proposed algorithms are evaluated on two corpora: a semantically heterogeneous news domain (HR-Net) and an application-specific travel reservation corpus (ATIS). It is shown that the (unsupervised) adaptive weighting scheme outperforms the (supervised) fixed weighting scheme. Up to 50% relative error reduction is achieved by the adaptive weighting scheme.
KW - Information retrieval
KW - Ontology creation
KW - Text processing
UR - http://www.scopus.com/inward/record.url?scp=48749123007&partnerID=8YFLogxK
U2 - 10.1109/SLT.2006.326823
DO - 10.1109/SLT.2006.326823
M3 - Conference contribution
AN - SCOPUS:48749123007
SN - 1424408733
SN - 9781424408733
T3 - 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings
SP - 86
EP - 89
BT - 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings
T2 - 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006
Y2 - 10 December 2006 through 13 December 2006
ER -