A MODEL APPROXIMATION SCHEME FOR PLANNING IN PARTIALLY OBSERVABLE STOCHASTIC DOMAINS

Authors
N.L. Zhang, W.J. Liu
Citation
N.L. Zhang and W.J. Liu, A MODEL APPROXIMATION SCHEME FOR PLANNING IN PARTIALLY OBSERVABLE STOCHASTIC DOMAINS, Journal of Artificial Intelligence Research, 7, 1997, pp. 199-230
Citations number
30
ISSN journal
1076-9757
Volume
7
Year of publication
1997
Pages
199 - 230
Database
ISI
SICI code
1076-9757(1997)7:<199:AMASFP>2.0.ZU;2-C
Abstract
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where the effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the original POMDP. We propose to solve the transformed POMDP and use its optimal policy to construct an approximate policy for the original POMDP. By controlling the amount of additional information that the oracle provides, it is possible to find a proper tradeoff between computational time and approximation quality. In terms of algorithmic contributions, we study in detail how to exploit region observability in solving the transformed POMDP. To facilitate the study, we also propose a new exact algorithm for general POMDPs. The algorithm is conceptually simple and yet is significantly more efficient than all previous exact algorithms.
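
To make the notion of region observability in the abstract more concrete, the Python sketch below is an illustration, not the paper's algorithm: the toy model, the partition into regions, and all function names (belief_update, regions, etc.) are assumptions added here. It shows how an oracle's report that the current state lies in a given region can be folded into an ordinary POMDP belief update by discarding probability mass outside that region.

import numpy as np

rng = np.random.default_rng(0)  # deterministic toy model

n_states, n_actions, n_obs = 4, 2, 2
# T[a, s, s'] = P(s' | s, a); O[a, s', o] = P(o | a, s')
T = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
O = rng.dirichlet(np.ones(n_obs), size=(n_actions, n_states))

# A partition of the state space into regions the oracle can report.
regions = [np.array([0, 1]), np.array([2, 3])]

def belief_update(b, a, o, region=None):
    # Standard Bayes update: b'(s') is proportional to
    # O(o | a, s') * sum_s T(s' | s, a) * b(s).
    b_next = O[a, :, o] * (T[a].T @ b)
    if region is not None:
        # Oracle report: the true state lies in the given region, so
        # probability mass outside the region is discarded.
        mask = np.zeros(n_states)
        mask[regions[region]] = 1.0
        b_next = b_next * mask
    return b_next / b_next.sum()

b0 = np.full(n_states, 1.0 / n_states)
print(belief_update(b0, 0, 1))            # ordinary POMDP belief
print(belief_update(b0, 0, 1, region=0))  # region-observable belief, support in {0, 1}

In the transformed problem the agent's beliefs stay confined to single regions, which is what makes it easier to solve than the original POMDP; this sketch only illustrates the belief-side effect of the oracle, not the paper's policy construction or its exact algorithm.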