We propose, for risk-sensitive control of finite Markov chains, a counterpart
of the popular 'actor-critic' algorithm for classical Markov decision
processes. The algorithm is based on a 'sensitivity formula' for the
risk-sensitive cost and is shown to converge with probability one to the
desired solution. The proof technique is an adaptation of the ordinary
differential equations approach for the analysis of two time-scale stochastic approximation
algorithms. (C) 2001 Elsevier Science B.V. All rights reserved.