PERCENTILE PERFORMANCE CRITERIA FOR LIMITING AVERAGE MARKOV DECISION-PROCESSES

Citation
Ja. Filar et al., PERCENTILE PERFORMANCE CRITERIA FOR LIMITING AVERAGE MARKOV DECISION-PROCESSES, IEEE transactions on automatic control, 40(1), 1995, pp. 2-10
Citations number
16
Categorie Soggetti
Controlo Theory & Cybernetics","Robotics & Automatic Control","Engineering, Eletrical & Electronic
ISSN journal
00189286
Volume
40
Issue
1
Year of publication
1995
Pages
2 - 10
Database
ISI
SICI code
0018-9286(1995)40:1<2:PPCFLA>2.0.ZU;2-2
Abstract
In this paper we address the following basic feasibility problem for i nfinite-horizon Markov decision processes (MDP's): can a policy be fou nd that achieves a specified value (target) of the long-run limiting a verage reward at a specified probability level (percentile)? Related o ptimization problems of maximizing the target for a specified percenti le and vice versa are also considered. We present a complete (and disc rete) classification of both the maximal achievable target levels and of their corresponding percentiles. We also provide an algorithm for c omputing a deterministic policy corresponding to any feasible target-p ercentile pair. Next we consider similar problems for an MDP with mult iple rewards and/or constraints. This case presents some difficulties and leads to several open problems. An LP-based formulation provides c onstructive solutions for most cases.