This paper presents a probabilistic framework for modelling spoken-dialogue
systems. On the assumption that the overall system behaviour can be
represented as a Markov decision process, the optimization of
dialogue-management strategy using reinforcement learning is reviewed.
Examples of learning behaviour are presented for both dynamic programming
and sampling methods, but the latter are preferred. The paper concludes by
noting the importance of user simulation models for the practical
application of these techniques and the need for developing methods of
mapping system features in order to achieve sufficiently compact state
spaces.
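
To make the contrast between dynamic programming and sampling methods concrete, the following is a minimal sketch on a toy dialogue problem. Everything in it is an illustrative assumption rather than the paper's own model: the two states ("unknown", "filled"), the two actions ("ask", "close"), the transition probabilities, the rewards, and the discount factor are all hypothetical. Value iteration stands in for the dynamic-programming approach and needs the full transition model; tabular Q-learning stands in for the sampling approach and only needs sampled dialogue turns, which here come from the same toy model acting as a crude user simulator.

```python
import random

# Hypothetical toy dialogue MDP (an assumption for illustration, not from the paper).
STATES = ["unknown", "filled"]   # non-terminal dialogue states
ACTIONS = ["ask", "close"]
GAMMA = 0.95                     # assumed discount factor

# MODEL[state][action] -> list of (probability, next_state, reward).
# next_state None means the dialogue has ended.
MODEL = {
    "unknown": {
        "ask": [(0.8, "filled", -1.0), (0.2, "unknown", -1.0)],
        "close": [(1.0, None, -10.0)],
    },
    "filled": {
        "ask": [(1.0, "filled", -1.0)],
        "close": [(1.0, None, 10.0)],
    },
}


def q_from_model(v, s, a):
    """Expected return of action a in state s, given state values v."""
    return sum(
        p * (r + GAMMA * (v[s2] if s2 is not None else 0.0))
        for p, s2, r in MODEL[s][a]
    )


def value_iteration(tol=1e-9):
    """Dynamic programming: sweep all states using the known model."""
    v = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(q_from_model(v, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            break
    return {s: max(ACTIONS, key=lambda a: q_from_model(v, s, a)) for s in STATES}


def sample_step(rng, s, a):
    """Simulated user/environment: draw one outcome from the toy model."""
    x = rng.random()
    cumulative = 0.0
    for p, s2, r in MODEL[s][a]:
        cumulative += p
        if x < cumulative:
            return s2, r
    return s2, r  # guard against floating-point rounding


def q_learning(episodes=5000, alpha=0.1, epsilon=0.2, seed=0, max_turns=50):
    """Sampling method: learn action values from simulated dialogues only."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    for _ in range(episodes):
        s = "unknown"
        for _ in range(max_turns):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                       # explore
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])   # exploit
            s2, r = sample_step(rng, s, a)
            target = r if s2 is None else r + GAMMA * max(q[s2].values())
            q[s][a] += alpha * (target - q[s][a])
            if s2 is None:
                break
            s = s2
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in STATES}


if __name__ == "__main__":
    print("value iteration:", value_iteration())
    print("Q-learning:     ", q_learning())
```

In this sketch both methods are expected to settle on the same strategy (ask while the slot is unknown, close once it is filled). The difference lies in what each needs: value iteration enumerates every state and transition probability, whereas Q-learning only requires the ability to sample dialogues, which is why a user simulation model and a compact state space matter in practice.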