N. L. Zhang and W. J. Liu, "A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains," Journal of Artificial Intelligence Research, 7, 1997, pp. 199-230.
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where the effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the original POMDP. We propose to solve the transformed POMDP and use its optimal policy to construct an approximate policy for the original POMDP. By controlling the amount of additional information that the oracle provides, it is possible to find a proper tradeoff between computation time and approximation quality. In terms of algorithmic contributions, we study in detail how to exploit region observability in solving the transformed POMDP. To facilitate the study, we also propose a new exact algorithm for general POMDPs. The algorithm is conceptually simple and yet significantly more efficient than all previous exact algorithms.
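To make the oracle idea concrete, the sketch below implements the standard POMDP belief update and then the extra information step the abstract describes: the oracle reveals that the true state lies in some region, so the belief is conditioned on that region. The toy POMDP (its states, transition matrices `T`, and observation matrices `O`) and the function names are illustrative assumptions, not the paper's notation.

```python
import numpy as np

# Hypothetical toy POMDP: 4 states, 2 actions, 2 observations.
# T[a][s][s'] = P(s' | s, a); O[a][s'][z] = P(z | s', a).
T = np.array([
    [[0.9, 0.1, 0.0, 0.0],
     [0.0, 0.9, 0.1, 0.0],
     [0.0, 0.0, 0.9, 0.1],
     [0.1, 0.0, 0.0, 0.9]],
    [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.5, 0.0, 0.0, 0.5]],
])
O = np.array([
    [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]],
    [[0.6, 0.4], [0.6, 0.4], [0.4, 0.6], [0.4, 0.6]],
])

def belief_update(b, a, z):
    """Standard belief update: b'(s') ∝ O(z|s',a) * sum_s T(s'|s,a) b(s)."""
    b_next = O[a][:, z] * (b @ T[a])
    return b_next / b_next.sum()

def region_observe(b, region):
    """Oracle step (region observability): the agent is told the true
    state lies in `region`, so the belief is conditioned on that set."""
    mask = np.zeros_like(b)
    mask[list(region)] = 1.0
    b_masked = b * mask
    return b_masked / b_masked.sum()

b = np.full(4, 0.25)                  # uniform initial belief
b = belief_update(b, a=0, z=1)        # act, then observe z = 1
b = region_observe(b, region={2, 3})  # oracle: state is in {2, 3}
print(b)                              # belief supported only on {2, 3}
```

Larger regions give the oracle less to say and keep the transformed problem close to the original POMDP; smaller regions make beliefs more concentrated and the transformed problem cheaper to solve, which is the tradeoff knob the abstract refers to.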