Abstract
Using different sources of information for grammar induction results in grammars that vary in coverage and precision. Fusing such grammars with a strategy that exploits their strengths while minimizing their weaknesses is expected to produce grammars with superior performance. We focus on the fusion of grammars produced by a knowledge-based approach that relies on lexicalized ontologies and a data-driven approach that relies on semantic similarity clustering. We propose various algorithms for finding the mapping between the (non-terminal) rules generated by each grammar induction algorithm, followed by rule fusion. Three fusion approaches are investigated: early, mid and late fusion. Results show that late fusion performs best, with a relative F-measure improvement of 20%.
Original language | English |
---|---|
Pages (from-to) | 288-292 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2014 |
Externally published | Yes |
Event | 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore. Duration: 14 Sept 2014 → 18 Sept 2014 |
Keywords
- Corpus-based grammar induction
- Grammar fusion
- Ontology-based grammar induction
- Spoken dialogue systems