Ja. Filar et al., PERCENTILE PERFORMANCE CRITERIA FOR LIMITING AVERAGE MARKOV DECISION-PROCESSES, IEEE transactions on automatic control, 40(1), 1995, pp. 2-10
Citations number
16
Categorie Soggetti
Controlo Theory & Cybernetics","Robotics & Automatic Control","Engineering, Eletrical & Electronic
In this paper we address the following basic feasibility problem for i
nfinite-horizon Markov decision processes (MDP's): can a policy be fou
nd that achieves a specified value (target) of the long-run limiting a
verage reward at a specified probability level (percentile)? Related o
ptimization problems of maximizing the target for a specified percenti
le and vice versa are also considered. We present a complete (and disc
rete) classification of both the maximal achievable target levels and
of their corresponding percentiles. We also provide an algorithm for c
omputing a deterministic policy corresponding to any feasible target-p
ercentile pair. Next we consider similar problems for an MDP with mult
iple rewards and/or constraints. This case presents some difficulties
and leads to several open problems. An LP-based formulation provides c
onstructive solutions for most cases.