ON THE COMPLEXITY OF PARTIALLY OBSERVED MARKOV DECISION-PROCESSES

Citation
D. Burago et al., ON THE COMPLEXITY OF PARTIALLY OBSERVED MARKOV DECISION-PROCESSES, Theoretical Computer Science, 157(2), 1996, pp. 161-183
Citations number
19
Subject Categories
Computer Sciences","Computer Science Theory & Methods
ISSN journal
03043975
Volume
157
Issue
2
Year of publication
1996
Pages
161 - 183
Database
ISI
SICI code
0304-3975(1996)157:2<161:OTCOPO>2.0.ZU;2-X
Abstract
In this paper we consider the complexity of constructing optimal policies (strategies) for some types of partially observed Markov decision processes. This particular case of the classical problem deals with finite stationary processes, and can be represented as constructing optimal strategies to reach target vertices from a starting vertex in a graph with colored vertices and probabilistic deviations from the edge chosen to follow. The colors of the visited vertices are the only information available to a strategy. The complexity of Markov decision processes in the case of perfect information (bijective coloring of vertices) is known and is briefly surveyed at the beginning of the paper. For the unobservable case (all colors are equal) we give an improvement of the result of Papadimitriou and Tsitsiklis: we show that the problem of constructing even a very weak approximation to an optimal strategy is NP-hard. Our main results concern the case of a fixed bound on the multiplicity of the coloring, that is, a case of partially observed processes where some upper bound on the unobservability is assumed. We show that the problem of finding an optimal strategy is still NP-hard, but polynomial-time approximations are possible. Some relations of our results to the Max-Word Problem are also indicated.
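
For readers unfamiliar with the graph formulation used in the abstract, the following is a minimal sketch, not taken from the paper, of a colored-graph process of this kind: choosing an outgoing edge may probabilistically deviate to another neighbor, and a strategy observes only the colors of the visited vertices. All names (ColoredGraphMDP, simulate, the deviation parameter, the toy graph) are hypothetical illustrations, not the authors' construction.

```python
import random

class ColoredGraphMDP:
    """Finite graph with colored vertices and probabilistic edge deviations."""
    def __init__(self, neighbors, color, targets, deviation=0.1):
        self.neighbors = neighbors      # vertex -> list of successor vertices
        self.color = color              # vertex -> observable color
        self.targets = targets          # set of target vertices
        self.deviation = deviation      # prob. of slipping onto a random edge

    def step(self, vertex, chosen_edge):
        """Follow the chosen edge, except that with probability `deviation`
        the process deviates onto a uniformly random outgoing edge."""
        succs = self.neighbors[vertex]
        if random.random() < self.deviation:
            return random.choice(succs)
        return succs[chosen_edge]

def simulate(mdp, strategy, start, horizon=100):
    """Run one trajectory; the strategy sees only the observed color sequence."""
    v, observed = start, []
    for _ in range(horizon):
        if v in mdp.targets:
            return True
        observed.append(mdp.color[v])
        chosen = strategy(observed, len(mdp.neighbors[v]))
        v = mdp.step(v, chosen)
    return v in mdp.targets

if __name__ == "__main__":
    # Toy example: 4 vertices; vertices 0 and 1 share a color, so the
    # coloring has multiplicity 2 (the bounded-unobservability setting).
    neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [3]}
    color = {0: "red", 1: "red", 2: "blue", 3: "green"}
    mdp = ColoredGraphMDP(neighbors, color, targets={3})

    # A memoryless, color-blind strategy: take the second edge when one exists.
    strategy = lambda colors, n_edges: min(1, n_edges - 1)
    hits = sum(simulate(mdp, strategy, start=0) for _ in range(1000))
    print(f"reached target in {hits}/1000 runs")
```

The sketch only makes the model concrete; the paper's results concern how hard it is to find (or approximate) an optimal strategy in this setting, depending on how many vertices may share a color.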