On maximal rewards and ε-optimal policies in continuous time Markov decision chains

Authors

Lembersky, Mark R.

Citation

Lembersky, Mark R., On maximal rewards and ε-optimal policies in continuous time Markov decision chains, Annals of Statistics, 2(1), 1974, pp. 159-169

Journal title

Annals of Statistics

ISSN journal

0090-5364

Volume

2

Issue

1

Year of publication

1974

Pages

159 - 169

Database

ACNP

SICI code

Abstract

For continuous time Markov decision chains of finite duration, we show that the vector of maximal total rewards, less a linear average-return term, converges as the duration t → ∞. We then show that there are policies which are both simultaneously ε-optimal for all durations t and are stationary except possibly for a final, finite segment. Further, the length of this final segment depends on ε, but not on t for large enough t, while the initial stationary part of the policy is independent of both ε and t.
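
As a hedged reading of the abstract in symbols: the notation below is assumed for illustration and does not come from this record. Write v(t) for the vector of maximal expected total rewards over duration t, g for the average-return vector, and v^π(t) for the total-reward vector of a policy π. Under those assumptions, the two results can be sketched as:

```latex
% Assumed notation (not from the record): v(t), g, w, \pi_\varepsilon, T(\varepsilon), \mathbf{1}.

% (1) The maximal total reward, less the linear average-return term, converges:
\lim_{t \to \infty} \bigl[\, v(t) - t\,g \,\bigr] = w
\quad \text{for some finite vector } w .

% (2) For each \varepsilon > 0 there is a policy \pi_\varepsilon that is
%     \varepsilon-optimal simultaneously for every duration t:
v^{\pi_\varepsilon}(t) \;\ge\; v(t) - \varepsilon\,\mathbf{1}
\quad \text{for all } t \ge 0 .

% Structure of \pi_\varepsilon for large enough t: a stationary rule is used
% until T(\varepsilon) time units remain; only the final segment of length
% T(\varepsilon) may be nonstationary. The stationary rule does not depend
% on \varepsilon or t, and T(\varepsilon) does not depend on t.
```

Here \mathbf{1} denotes the all-ones vector, so the inequality in (2) is read componentwise, one component per state.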