In the paper we consider the complexity of constructing optimal policies (strategies) for some type of partially observed Markov decision processes. This particular case of the classical problem deals with finite stationary processes, and can be represented as constructing optimal strategies to reach target vertices from a starting vertex in a graph with colored vertices and probabilistic deviations from an edge chosen to follow. The colors of the visited vertices are the only information available to a strategy. The complexity of Markov decision processes in the case of perfect information (bijective coloring of vertices) is known and briefly surveyed at the beginning of the paper. For the unobservable case (all the colors are equal) we give an improvement of the result of Papadimitriou and Tsitsiklis, namely we show that the problem of constructing even a very weak approximation to an optimal strategy is NP-hard. Our main results concern the case of a fixed bound on the multiplicity of the coloring, that is, a case of partially observed processes where some upper bound on the unobservability is assumed. We show that the problem of finding an optimal strategy is still NP-hard, but polynomial-time approximations are possible. Some relations of our results to the Max-Word Problem are also indicated.
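For concreteness, one possible formalization of the colored-graph model sketched above is given below; the notation and the exact optimization criterion are our own gloss and are not taken verbatim from the paper.
\[
  M = (V, E, c, p, v_0, T), \qquad c : V \to C, \qquad p(\,\cdot \mid v, e) \in \Delta(V),
\]
where $c$ is the coloring of the vertices, $p(\,\cdot \mid v, e)$ models the probabilistic deviation from the edge $e$ chosen at vertex $v$, and a strategy $\sigma : C^{*} \to E$ selects the next edge from the sequence of colors observed so far, the goal being to reach the target set $T$ from $v_0$. In this notation a bijective $c$ corresponds to the perfect-information case, a constant $c$ to the unobservable case, and the bounded-multiplicity case assumes $\max_{a \in C} |c^{-1}(a)| \le k$ for some fixed $k$.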