A sensitivity formula for risk-sensitive cost and the actor-critic algorithm

Authors
V.S. Borkar
Citation
V.S. Borkar, A sensitivity formula for risk-sensitive cost and the actor-critic algorithm, Systems & Control Letters, 44(5), 2001, pp. 339-346
Number of citations
25
Subject categories
AI, Robotics and Automatic Control
Journal title
SYSTEMS & CONTROL LETTERS
Journal ISSN
0167-6911
Volume
44
Issue
5
Year of publication
2001
Pages
339 - 346
Database
ISI
SICI code
0167-6911(200112)44:5<339:ASFFRC>2.0.ZU;2-T
Abstract
We propose for risk-sensitive control of finite Markov chains a counterpart of the popular 'actor-critic' algorithm for classical Markov decision processes. The algorithm is based on a 'sensitivity formula' for the risk-sensitive cost and is shown to converge with probability one to the desired solution. The proof technique is an adaptation of the ordinary differential equations approach for the analysis of two time-scale stochastic approximation algorithms. © 2001 Elsevier Science B.V. All rights reserved.