The thoughtful elephant: Strategies for spoken dialog systems

Citation
B. Souvignier et al., The thoughtful elephant: Strategies for spoken dialog systems, IEEE SPEECH, 8(1), 2000, pp. 51-62
Number of citations
40
Subject categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
Journal ISSN
1063-6676
Volume
8
Issue
1
Year of publication
2000
Pages
51 - 62
Database
ISI
SICI code
1063-6676(200001)8:1<51:TTESFS>2.0.ZU;2-3
Abstract
In this paper we present technology used in spoken dialog systems for applications of a wide range. They include tasks from the travel domain and automatic switchboards as well as large scale directory assistance. The overall goal in developing spoken dialog systems is to allow for a natural and flexible dialog flow similar to human-human interaction. This imposes the challenging task to recognize and interpret user input, where he/she is allowed to choose from an unrestricted vocabulary and an infinite set of possible formulations. We therefore put emphasis on strategies that make the system more robust while still maintaining a high level of naturalness and flexibility. In view of this paradigm, we found that two fundamental principles characterize many of the proposed methods: 1) to consider available sources of information as early as possible, and 2) to keep alternative hypotheses and delay the decision for a single option as long as possible. We describe how our system architecture caters to incorporating application specific knowledge, including, for example, database constraints, in the determination of the best sentence hypothesis for a user turn. On the next higher level, we use the dialog history to assess the plausibility of a sentence hypothesis by applying consistency checks with information items from previous user turns. In particular, we demonstrate how combination decisions over several turns can be exploited to boost the recognition performance of the system. The dialog manager can also use information on the dialog flow to dynamically modify and tune the system for the specific dialog situations. An important means to increase the "intelligence" of a spoken dialog system is to use confidence measures. We propose methods to obtain confidence measures for semantic items, whole sentences and even full N-best lists and give examples for the benefits obtained from their application. Experiences from field tests with our systems are summarized that have been found crucial for the system acceptance.
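
Illustrative note (not part of the original record): the two principles named in the abstract, using available knowledge early and keeping alternative hypotheses open, can be sketched for a directory-assistance style task. The sketch below is only an assumption about how such a combination decision over two user turns might look; it is not the authors' implementation. The names Hypothesis and best_consistent_pair, the additive scoring, and the sample directory are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hypothesis:
    """One N-best entry: a semantic value plus a recognizer score."""
    value: str
    score: float  # higher is better (assumed log-probability-like)


def best_consistent_pair(first_names, last_names, directory):
    """Return the best (first, last) pair that exists in the directory.

    Both N-best lists are kept intact until this point, so a lower-ranked
    hypothesis can still win when the top-ranked one fails the database check.
    Returns None when no combination is consistent with the directory.
    """
    best = None
    for first, last in product(first_names, last_names):
        if (first.value, last.value) not in directory:
            continue  # violates the (hypothetical) database constraint
        total = first.score + last.score  # assumed additive combination
        if best is None or total > best[0]:
            best = (total, first.value, last.value)
    return None if best is None else (best[1], best[2])


# Hypothetical data: two user turns, each with a two-entry N-best list.
directory = {("anna", "meier"), ("hanna", "mayer")}
turn_1 = [Hypothesis("hanna", -1.0), Hypothesis("anna", -1.2)]
turn_2 = [Hypothesis("meier", -0.8), Hypothesis("mayer", -1.5)]

result = best_consistent_pair(turn_1, turn_2, directory)
print(result)
```

In this made-up example, taking the top entry of each turn separately would give "hanna meier", which is not in the directory. Deciding jointly over both lists selects a pair that is both listed and has the best combined score.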