N. L. Zhang and W. J. Liu, "A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains," Journal of Artificial Intelligence Research, 7, 1997, pp. 199-230.
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where the effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the original POMDP. We propose to solve the transformed POMDP and use its optimal policy to construct an approximate policy for the original POMDP. By controlling the amount of additional information that the oracle provides, it is possible to find a proper tradeoff between computation time and approximation quality. In terms of algorithmic contributions, we study in detail how to exploit region observability in solving the transformed POMDP. To facilitate the study, we also propose a new exact algorithm for general POMDPs. The algorithm is conceptually simple and yet significantly more efficient than all previous exact algorithms.
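To make the oracle idea concrete, the sketch below implements the standard POMDP belief update and then the extra information step the abstract describes: the oracle reveals that the true state lies in some region, so the belief is conditioned on that region. The toy POMDP (its states, transition matrices `T`, and observation matrices `O`) and the function names are illustrative assumptions, not the paper's notation.

```python
import numpy as np

# Hypothetical toy POMDP: 4 states, 2 actions, 2 observations.
# T[a][s][s'] = P(s' | s, a); O[a][s'][z] = P(z | s', a).
T = np.array([
    [[0.9, 0.1, 0.0, 0.0],
     [0.0, 0.9, 0.1, 0.0],
     [0.0, 0.0, 0.9, 0.1],
     [0.1, 0.0, 0.0, 0.9]],
    [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.5, 0.0, 0.0, 0.5]],
])
O = np.array([
    [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]],
    [[0.6, 0.4], [0.6, 0.4], [0.4, 0.6], [0.4, 0.6]],
])

def belief_update(b, a, z):
    """Standard belief update: b'(s') ∝ O(z|s',a) * sum_s T(s'|s,a) b(s)."""
    b_next = O[a][:, z] * (b @ T[a])
    return b_next / b_next.sum()

def region_observe(b, region):
    """Oracle step (region observability): the agent is told the true
    state lies in `region`, so the belief is conditioned on that set."""
    mask = np.zeros_like(b)
    mask[list(region)] = 1.0
    b_masked = b * mask
    return b_masked / b_masked.sum()

b = np.full(4, 0.25)                  # uniform initial belief
b = belief_update(b, a=0, z=1)        # act, then observe z = 1
b = region_observe(b, region={2, 3})  # oracle: state is in {2, 3}
print(b)                              # belief supported only on {2, 3}
```

Larger regions give the oracle less to say and keep the transformed problem close to the original POMDP; smaller regions make beliefs more concentrated and the transformed problem cheaper to solve, which is the tradeoff knob the abstract refers to.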