Penalty function and adaptive control of constrained finite Markov chains

Citation
K. Najim and A. S. Poznyak, Penalty function and adaptive control of constrained finite Markov chains, INT J ADAPT, 12(7), 1998, pp. 545-565
Number of citations
35
Subject Categories
AI Robotics and Automatic Control
Journal title
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING
ISSN journal
0890-6327
Volume
12
Issue
7
Year of publication
1998
Pages
545 - 565
Database
ISI
SICI code
0890-6327(199811)12:7<545:PFAACO>2.0.ZU;2-1
Abstract
In this paper we consider the adaptive control of constrained finite ergodic controlled Markov chains whose transition probabilities are unknown. The control policy is designed to minimize a loss function under a set of inequality constraints. The average values of the conditional mathematical expectations of this loss function and of the constraints are also assumed to be unknown. A regularized penalty function is introduced to derive an adaptive control algorithm. In this algorithm the transition probabilities of the Markov chain and the average values of the constraints are estimated at each time n. The control policy is adjusted using the Bush-Mosteller reinforcement scheme as a stochastic approximation procedure. Its asymptotic properties are stated. We establish that the optimal convergence rate is equal to n^(-1/3+delta), where delta is any small positive parameter. (C) 1998 John Wiley & Sons, Ltd.
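
The abstract names the Bush-Mosteller reinforcement scheme but does not give its update rule. As a rough illustration only, the sketch below shows one common linear reward-penalty form of that scheme for a probability vector over N actions. It is not taken from the paper: the function name, the normalized loss xi in [0, 1], the step size gamma, and the toy loss model in the demo are all assumptions made for this example, and the paper's penalty function, constraint estimates, and transition-probability estimates are not modelled.

```python
import random


def bush_mosteller_update(p, action, xi, gamma):
    """One Bush-Mosteller style step on a probability vector (illustrative).

    p      : list of action probabilities (non-negative, summing to 1)
    action : index of the action that was just applied
    xi     : normalized loss in [0, 1] observed after the action (assumed)
    gamma  : step size in [0, 1] (assumed)

    The new vector is a convex combination of p and a target distribution
    that puts mass (1 - xi) on the chosen action and spreads xi evenly over
    the other actions, so the result stays on the probability simplex.
    """
    n = len(p)
    if n < 2:
        raise ValueError("need at least two actions")
    if not (0.0 <= xi <= 1.0 and 0.0 <= gamma <= 1.0):
        raise ValueError("xi and gamma must lie in [0, 1]")
    target = [(1.0 - xi) if k == action else xi / (n - 1) for k in range(n)]
    return [(1.0 - gamma) * pk + gamma * tk for pk, tk in zip(p, target)]


def run_demo(steps=2000, seed=0):
    """Toy run: three actions with hypothetical mean losses (not from the paper)."""
    rng = random.Random(seed)
    mean_loss = [0.8, 0.2, 0.6]  # hypothetical values, chosen for illustration
    p = [1.0 / 3.0] * 3
    for n in range(1, steps + 1):
        action = rng.choices(range(3), weights=p)[0]
        # Bernoulli loss in {0, 1} with the hypothetical mean for this action.
        xi = 1.0 if rng.random() < mean_loss[action] else 0.0
        gamma = 1.0 / (n + 10)  # decreasing step size, an arbitrary choice
        p = bush_mosteller_update(p, action, xi, gamma)
    return p


if __name__ == "__main__":
    print(run_demo())
```

Writing the update as a convex combination keeps every iterate a valid probability vector for any gamma in [0, 1], so no separate projection step is needed in this simplified setting.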