Minimizing risk models in Markov decision processes with policies depending on target values

Authors
Cb. Wu; Yl. Lin
Citation
Cb. Wu and Yl. Lin, Minimizing risk models in Markov decision processes with policies depending on target values, J MATH ANAL, 231(1), 1999, pp. 47-67
Number of citations
7
Subject categories
Mathematics
Journal title
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS
Journal ISSN
0022-247X
Volume
231
Issue
1
Year of publication
1999
Pages
47 - 67
Database
ISI
SICI code
0022-247X(19990301)231:1<47:MRMIMD>2.0.ZU;2-9
Abstract
This paper studies risk minimization problems in Markov decision processes with countable state space and reward set. The objective is to find a policy that minimizes the probability (risk) that the total discounted reward does not exceed a specified value (target). In this sort of model, the decision made by the decision maker depends not only on the system's states but also on the decision maker's target values. By introducing the decision maker's state, we formulate a framework for minimizing risk models. The policies discussed depend on target values, and the rewards may be arbitrary real numbers. For the finite horizon model, the main results obtained are: (i) the optimal value functions are distribution functions of the target, (ii) there exists an optimal deterministic Markov policy, and (iii) a policy is optimal if and only if it takes an optimal action at each realizable state. In addition, we obtain a sufficient condition and a necessary condition for the existence of a finite horizon optimal policy independent of targets, and we give an algorithm for computing finite horizon optimal policies and optimal value functions. For the infinite horizon model, we establish the optimality equation and obtain the structure property of optimal policies. We prove that the optimal value function is a distribution function of the target, and we present a new approximation formula that generalizes the nonnegative rewards case. An example illustrating errors in the previous literature shows that the existence of an optimal policy has not actually been proved. In this paper, we give a necessary and sufficient condition for the existence of an infinite horizon optimal policy independent of targets, and we point out that whether an optimal policy exists in the general case remains an open problem. (C) 1999 Academic Press.
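The finite horizon setting described in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the paper; it is a plain backward recursion on the augmented state (system state, remaining target), under the assumption that the risk with n steps left satisfies V_n(s, x) = min over actions of the expected value of V_{n-1}(s', (x - r) / beta), with V_0(s, x) = 1 if x >= 0 and 0 otherwise. The toy MDP, the discount factor `BETA`, and the names `TRANSITIONS` and `min_risk` are all hypothetical.

```python
from functools import lru_cache

BETA = 0.5  # discount factor (assumed value for the toy example)

# Hypothetical toy MDP with a single state and two actions.
# TRANSITIONS[state][action] = list of (probability, reward, next_state).
TRANSITIONS = {
    "s0": {
        "safe": [(1.0, 1.0, "s0")],
        "risky": [(0.5, 0.0, "s0"), (0.5, 3.0, "s0")],
    },
}


@lru_cache(maxsize=None)
def min_risk(state, target, steps_left):
    """Return (minimal risk, minimizing action) for the augmented state.

    The risk is the probability that the discounted reward collected
    over `steps_left` steps does not exceed `target`.
    """
    if steps_left == 0:
        # No reward left to collect: the total is 0, so the event
        # "total <= target" holds exactly when target >= 0.
        return (1.0 if target >= 0.0 else 0.0), None

    best_value, best_action = 2.0, None  # 2.0 exceeds any probability
    for action, outcomes in TRANSITIONS[state].items():
        # After receiving reward r, the remaining discounted reward must
        # not exceed (target - r) / BETA for the total to stay <= target.
        value = sum(
            p * min_risk(nxt, (target - r) / BETA, steps_left - 1)[0]
            for p, r, nxt in outcomes
        )
        if value < best_value:
            best_value, best_action = value, action
    return best_value, best_action


if __name__ == "__main__":
    for x in (0.5, 2.0):
        print(x, min_risk("s0", x, 1))
```

In this toy example the minimizing action in the same system state changes with the target (a low target favours the certain reward, a higher one favours the gamble), which is the dependence of policies on target values that the abstract refers to.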