This paper proposes a simulation-based algorithm for optimizing the average reward in a finite-state Markov reward process that depends on a set of parameters. As a special case, the method applies to Markov decision processes where optimization takes place within a parametrized set of policies. The algorithm relies on the regenerative structure of finite-state Markov processes, involves the simulation of a single sample path, and can be implemented online. A convergence result (with probability 1) is provided.
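To make the idea concrete, the following is a minimal, hypothetical sketch of this style of method, not the paper's actual algorithm. It assumes a toy two-state Markov reward process in which the probability of staying in state `s` is `sigmoid(theta[s])`, takes state 0 as the regeneration state, and applies a likelihood-ratio gradient estimate of the average reward at each regeneration along a single simulated sample path. All names (`simulate_and_optimize`, `theta`, `eta`) and the specific update rules are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_and_optimize(r, n_cycles=2000, lr=0.02, seed=0):
    """Illustrative sketch (not the paper's algorithm): optimize the
    average reward of a two-state Markov reward process by simulating a
    single sample path and taking a likelihood-ratio gradient step at
    each return to the regeneration state (state 0)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]   # in state s, P(stay in s) = sigmoid(theta[s])
    eta = 0.0            # running estimate of the average reward
    for _ in range(n_cycles):
        s = 0
        z = [0.0, 0.0]   # cumulative score d(log P)/d(theta) since regeneration
        g = [0.0, 0.0]   # gradient estimate accumulated over the cycle
        cycle_r = 0.0
        cycle_len = 0
        while True:
            # credit the current reward to all earlier transitions in the cycle
            cycle_r += r[s]
            cycle_len += 1
            for k in range(2):
                g[k] += (r[s] - eta) * z[k]
            # sample the next transition and record its score
            p_stay = sigmoid(theta[s])
            stay = rng.random() < p_stay
            z[s] += (1.0 - p_stay) if stay else -p_stay
            s = s if stay else 1 - s
            if s == 0:   # regeneration: end the cycle and update online
                break
        eta += 0.1 * (cycle_r / cycle_len - eta)
        for k in range(2):
            theta[k] += lr * g[k] / cycle_len
    return theta, eta

# State 0 pays reward 1 per step and state 1 pays nothing, so the updates
# should drive theta[0] up (stay in state 0) and theta[1] down (leave state 1).
theta, eta = simulate_and_optimize([1.0, 0.0])
```

Because the parameter update happens only at regeneration times and uses quantities accumulated along the current cycle, the whole procedure runs online from one sample path, in the spirit of the abstract.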