A. Grosfeld-Nir, A 2-STATE PARTIALLY OBSERVABLE MARKOV DECISION PROCESS WITH UNIFORMLY DISTRIBUTED OBSERVATIONS, Operations Research, 44(3), 1996, pp. 458-463
Citations number: 14
Subject Categories: Management, "Operations Research & Management Science"
A controller observes a production system periodically, over time. If the system is in the GOOD state during one period, there is a constant probability that it will deteriorate and be in the BAD state during the next period (and remain there). The true state of the system is unobservable and can only be inferred from observations (quality of output). Two actions are available: CONTINUE or REPLACE (for a fixed cost). The objective is to maximize the expected discounted value of the total future income. For both the finite- and infinite-horizon problems, the optimal policy is of a CONTROL LIMIT (CLT) type: continue if the good-state probability exceeds the CLT, and replace otherwise. The computation of the CLT involves a functional equation for which no analytical solution is yet known. For uniformly distributed observations we obtain the infinite-horizon CLT analytically. We also show that the finite-horizon CLTs, as a function of the time remaining, are not necessarily monotone, which is counterintuitive.
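The model described in the abstract can be sketched in a few lines: a Bayesian update of the good-state belief after each observation, followed by the control-limit rule. This is a minimal illustrative sketch, not the paper's derivation; the deterioration probability `Q`, the control limit `CLT`, and the two uniform observation densities are hypothetical parameters chosen for illustration.

```python
# Illustrative 2-state POMDP sketch (parameters are assumptions, not the paper's).
Q = 0.1      # assumed per-period probability GOOD -> BAD (BAD is absorbing)
CLT = 0.4    # hypothetical control limit on the good-state belief

def density_good(y):
    # assumed: output of a GOOD system is Uniform(0.5, 1.0)
    return 2.0 if 0.5 <= y <= 1.0 else 0.0

def density_bad(y):
    # assumed: output of a BAD system is Uniform(0.0, 1.0)
    return 1.0 if 0.0 <= y <= 1.0 else 0.0

def belief_update(p, y):
    """Bayes update of P(GOOD) after one transition and observation y."""
    p_pred = p * (1.0 - Q)                    # system may deteriorate first
    num = p_pred * density_good(y)
    den = num + (1.0 - p_pred) * density_bad(y)
    return num / den if den > 0 else 0.0

def act(p):
    """Control-limit rule: CONTINUE above the CLT, REPLACE otherwise."""
    return "CONTINUE" if p > CLT else "REPLACE"
```

For example, starting from certainty in the GOOD state (`p = 1.0`), a high-quality observation such as `y = 0.9` keeps the belief near one, while an observation below the GOOD support (e.g. `y = 0.2`) drives the belief to zero and triggers REPLACE.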