The SpeDial datasets: Datasets for Spoken Dialogue Systems analytics

José Lopes, Arodami Chorianopoulou, Elisavet Palogiannidi, Helena Moniz, Alberto Abad, Katerina Louka, Elias Iosif, Alexandros Potamianos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The SpeDial consortium is sharing two datasets that were used during the SpeDial project. By sharing them with the community, we provide a resource that can shorten the development cycle of new Spoken Dialogue Systems (SDSs). The datasets include audio recordings and several manual annotations: miscommunication, anger, satisfaction, repetition, gender and task success. They were created with data from real users and cover two languages, English and Greek. Detectors for miscommunication, anger and gender were trained for both systems. The detectors were particularly accurate on tasks for which human annotators show high agreement, such as miscommunication and gender. As expected given the subjectivity of the task, the anger detector performed less satisfactorily. Nevertheless, we showed that automatic detection of situations that can lead to problems in SDSs is possible and is a promising direction for shortening the SDS development cycle.

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
Editors: Nicoletta Calzolari, Khalid Choukri, Helene Mazo, Asuncion Moreno, Thierry Declerck, Sara Goggi, Marko Grobelnik, Jan Odijk, Stelios Piperidis, Bente Maegaard, Joseph Mariani
Publisher: European Language Resources Association (ELRA)
Pages: 104-110
Number of pages: 7
ISBN (Electronic): 9782951740891
Publication status: Published - 2016
Externally published: Yes
Event: 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia
Duration: 23 May 2016 to 28 May 2016

Publication series

Name: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016

Conference

Conference: 10th International Conference on Language Resources and Evaluation, LREC 2016
Country/Territory: Slovenia
City: Portoroz
Period: 23/05/16 to 28/05/16

Keywords

  • Emotions
  • Multi-lingual data
  • Sentiment analysis
  • Spoken Dialogue Systems
