The Philips automatic telephone switchboard and directory information
system PADIS provides a natural-language user interface to a telephone
directory database. Using speech recognition and language understanding
technologies, the system offers phone numbers, fax numbers, e-mail
addresses, and room numbers as well as direct call completion to a
desired party. In this paper, we present the underlying probabilistic
framework, the system architecture, and the individual modules for speech
recognition, language understanding, dialogue control, and speech
output. In addition, we report results on performance and user behaviour
obtained from a field test in our research lab with a 600-entry
database. We derive a new maximum-a-posteriori decision rule which
incorporates database knowledge and dialogue history as constraints in
speech recognition and language understanding. It has improved speech
understanding accuracy by 19% (in terms of concept error rate), and
reduced attribute substitution errors (e.g. recognition of a wrong
name) by 38%.
The decision rule is implemented in a multi-stage approach as a
combination of state-of-the-art speech recognition, partial parsing with an
attributed stochastic context-free grammar, and an N-best algorithm
which is also described in this paper. The system conducts a flexible
mixed-initiative dialogue rather than using a rigid form-filling scheme,
and incorporates database knowledge to optimize the dialogue flow.
(C) 1997 Elsevier Science B.V.
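
The idea of using database knowledge and dialogue history as constraints when choosing among N-best hypotheses can be illustrated with a minimal sketch. This is not the decision rule derived in the paper, whose exact form the abstract does not give. It assumes that each hypothesis already carries a combined recognition log score and a set of parsed attributes, that the database prior is simply the fraction of directory entries matching those attributes, and that dialogue history is a list of attribute values the caller has already rejected. All names (Hypothesis, DIRECTORY, database_log_prior, rescore) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    log_score: float   # assumed: combined acoustic + language-model log score
    attributes: dict   # assumed: attribute/value pairs produced by the parser


# Hypothetical two-entry directory standing in for the real database.
DIRECTORY = [
    {"first_name": "anna", "last_name": "meyer"},
    {"first_name": "hans", "last_name": "maier"},
]


def database_log_prior(attributes, directory):
    """Log of the fraction of entries consistent with the attributes.

    Returns -inf when no entry matches, which rules the hypothesis out.
    This simple prior is an assumption made for illustration only.
    """
    matches = sum(
        all(entry.get(key) == value for key, value in attributes.items())
        for entry in directory
    )
    return math.log(matches / len(directory)) if matches else -math.inf


def rescore(nbest, directory, rejected=()):
    """Return the hypothesis with the highest constrained score, or None.

    `rejected` holds (attribute, value) pairs the caller denied earlier
    in the dialogue; hypotheses containing any of them are skipped.
    """
    best, best_score = None, -math.inf
    for hyp in nbest:
        if any(hyp.attributes.get(key) == value for key, value in rejected):
            continue
        score = hyp.log_score + database_log_prior(hyp.attributes, directory)
        if score > best_score:
            best, best_score = hyp, score
    return best


nbest = [
    Hypothesis("anna maier", -10.0, {"first_name": "anna", "last_name": "maier"}),
    Hypothesis("anna meyer", -10.5, {"first_name": "anna", "last_name": "meyer"}),
    Hypothesis("hans maier", -12.0, {"first_name": "hans", "last_name": "maier"}),
]

first_choice = rescore(nbest, DIRECTORY)
after_rejection = rescore(nbest, DIRECTORY, rejected=[("last_name", "meyer")])
```

In this toy example the top-scoring recognition result matches no directory entry and is discarded, so the second hypothesis is selected. Once the caller rejects that last name, the third hypothesis becomes the only one left that is consistent with both the database and the dialogue history.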