Asymptotically Efficient Adaptive Choice of Control Laws in Controlled Markov Chains

Authors
T.L. Graves, T.L. Lai
Citation
T.L. Graves and T.L. Lai, Asymptotically efficient adaptive choice of control laws in controlled Markov chains, SIAM Journal on Control and Optimization, 35(3), 1997, pp. 715-743
Citations number
18
Categorie Soggetti
Control Theory & Cybernetics, Mathematics
ISSN journal
0363-0129
Volume
35
Issue
3
Year of publication
1997
Pages
715 - 743
Database
ISI
SICI code
0363-0129(1997)35:3<715:AEACOC>2.0.ZU;2-#
Abstract
We consider a controlled Markov chain on a general state space whose transition probabilities are parameterized by an unknown parameter belonging to a compact metric space. There is a one-step reward associated with each pair of control and the following state of the process. Given a finite set of stationary control laws, under each of which the Markov chain is uniformly recurrent, an optimal control law in this set is one that maximizes the long-run average reward. In ignorance of the parameter value, we construct an adaptive control rule which uses the optimal control law(s) at a relative frequency of 1 - O(n^{-1} log n) and show that this relative frequency gives an asymptotically optimal balance between the control objective and the amount of information needed to learn about the unknown parameter. The basic idea underlying this construction is to introduce suitable "uncertainty adjustments" via sequential testing theory into the certainty-equivalence rule, thus resolving the apparent dilemma between control and information.
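Illustrative sketch (not from the paper)
The following is a minimal numerical sketch of the general idea described in the abstract: a certainty-equivalence rule that, as an "uncertainty adjustment", forces the apparently inferior control law only about log n times in n steps, so the apparently optimal law is used with relative frequency 1 - O(n^{-1} log n). The toy model (two control laws, a scalar unknown parameter theta, next-state distribution independent of the current state) and the constant C_EXPLORE are assumptions made for illustration; the paper's actual construction uses sequential testing theory and is not reproduced here.

```python
# Hedged sketch: certainty-equivalence control with O(log n) forced exploration.
# The model below is an assumed toy example, not the authors' construction.

import math
import random

random.seed(0)

THETA_TRUE = 0.7   # unknown transition parameter (known only to the simulator)
C_EXPLORE = 2.0    # exploration constant (assumed, not from the paper)
N_STEPS = 20000

def step(law, theta):
    """Sample the next state: P(next state = 1) is theta under law 0 and
    1 - theta under law 1.  One-step reward is 1 in state 1, else 0.
    (The next state does not depend on the current state in this toy model,
    which trivializes the Markov structure but keeps the sketch short.)"""
    p1 = theta if law == 0 else 1.0 - theta
    nxt = 1 if random.random() < p1 else 0
    return nxt, float(nxt)

num_obs = 0            # number of transitions observed so far
successes = 0          # Bernoulli successes used to estimate theta
pulls = {0: 0, 1: 0}   # usage count of each control law
total_reward = 0.0

for n in range(1, N_STEPS + 1):
    # Certainty-equivalence estimate: the indicator of the next state being 1
    # is Bernoulli(theta) under law 0 and Bernoulli(1 - theta) under law 1.
    theta_hat = successes / num_obs if num_obs > 0 else 0.5

    # Certainty-equivalence choice: pick the law with the larger estimated
    # long-run average reward (theta_hat for law 0, 1 - theta_hat for law 1).
    ce_law = 0 if theta_hat >= 0.5 else 1
    other = 1 - ce_law

    # "Uncertainty adjustment" (crude stand-in for the paper's sequential
    # tests): force the apparently inferior law whenever it has been used
    # fewer than C_EXPLORE * log n times, so experimentation grows only
    # logarithmically and the preferred law runs at frequency 1 - O(n^{-1} log n).
    law = other if pulls[other] < C_EXPLORE * math.log(n + 1) else ce_law

    nxt, reward = step(law, THETA_TRUE)
    pulls[law] += 1
    total_reward += reward

    # Update the sufficient statistic for theta: success is the next state
    # under law 0 and its complement under law 1.
    successes += nxt if law == 0 else 1 - nxt
    num_obs += 1

preferred = max(pulls, key=pulls.get)
print(f"theta_hat = {successes / num_obs:.3f}")
print(f"relative frequency of the preferred law: {pulls[preferred] / N_STEPS:.4f}")
print(f"average reward per step: {total_reward / N_STEPS:.3f} "
      f"(best long-run average = {max(THETA_TRUE, 1 - THETA_TRUE):.3f})")
```

Running the sketch, the preferred law's relative frequency approaches 1 while only about C_EXPLORE * log(N_STEPS) steps are spent on the other law, which is the balance between control and information that the abstract describes in asymptotic form.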