A 2-STATE PARTIALLY OBSERVABLE MARKOV DECISION-PROCESS WITH UNIFORMLY DISTRIBUTED OBSERVATIONS

Authors
Citation
A. Grosfeld-Nir, A 2-STATE PARTIALLY OBSERVABLE MARKOV DECISION-PROCESS WITH UNIFORMLY DISTRIBUTED OBSERVATIONS, Operations Research, 44(3), 1996, pp. 458-463
Citations number
14
Subject categories
Management, "Operations Research & Management Science"
Journal title
Operations Research
ISSN journal
0030364X
Volume
44
Issue
3
Year of publication
1996
Pages
458 - 463
Database
ISI
SICI code
0030-364X(1996)44:3<458:A2POMD>2.0.ZU;2-N
Abstract
A controller observes a production system periodically, over time. If the system is in the GOOD state during one period, there is a constant probability that it will deteriorate and be in the BAD state during the next period (and remain there). The true state of the system is unobservable and can only be inferred from observations (quality of output). Two actions are available: CONTINUE or REPLACE (for a fixed cost). The objective is to maximize the expected discounted value of the total future income. For both the finite- and infinite-horizon problems, the optimal policy is of a CONTROL LIMIT (CLT) type: continue if the good-state probability exceeds the CLT, and replace otherwise. The computation of the CLT involves a functional equation, for which an analytical solution is as yet unknown in general. For uniformly distributed observations we obtain the infinite-horizon CLT analytically. We also show that the finite-horizon CLTs, as a function of the time remaining, are not necessarily monotone, which is counterintuitive.
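The control-limit policy described in the abstract can be illustrated with a short simulation sketch. Everything below is a hedged illustration, not the paper's model: the parameter values, the specific uniform observation supports (overlapping intervals, so the state is only partially observable), and the control limit itself are all assumptions chosen for demonstration; the paper derives the infinite-horizon CLT analytically rather than fixing it by hand.

```python
import random

# Illustrative parameters only -- none of these values come from the paper.
Q = 0.1                      # per-period deterioration probability GOOD -> BAD
REPLACE_COST = 5.0           # fixed cost of the REPLACE action
REWARD_GOOD, REWARD_BAD = 1.0, 0.2
DISCOUNT = 0.95
CLT = 0.4                    # assumed control limit on the good-state probability

def observe(state):
    """Uniformly distributed output quality; overlapping supports make the
    state only partially observable (an assumed parameterization)."""
    if state == "GOOD":
        return random.uniform(0.0, 1.0)
    return random.uniform(0.5, 1.5)

def density(y, state):
    """Uniform observation densities matching observe()."""
    if state == "GOOD":
        return 1.0 if 0.0 <= y <= 1.0 else 0.0
    return 1.0 if 0.5 <= y <= 1.5 else 0.0

def update_belief(p, y):
    """Bayes update of P(GOOD) after one transition step and observation y."""
    p_prior = p * (1.0 - Q)          # GOOD survives with prob 1-Q; BAD is absorbing
    num = p_prior * density(y, "GOOD")
    den = num + (1.0 - p_prior) * density(y, "BAD")
    return num / den if den > 0 else 0.0

def simulate(horizon=50, seed=0):
    """Run the CONTROL LIMIT rule: replace whenever belief drops below CLT."""
    random.seed(seed)
    state, p, total, disc = "GOOD", 1.0, 0.0, 1.0
    for _ in range(horizon):
        if p < CLT:                  # REPLACE: pay the fixed cost, renew the system
            total -= disc * REPLACE_COST
            state, p = "GOOD", 1.0
        total += disc * (REWARD_GOOD if state == "GOOD" else REWARD_BAD)
        if state == "GOOD" and random.random() < Q:
            state = "BAD"
        p = update_belief(p, observe(state))
        disc *= DISCOUNT
    return total
```

Note how the overlapping uniform supports drive the inference: an observation below 0.5 is only possible in the GOOD state (belief jumps to 1), one above 1.0 is only possible in the BAD state (belief drops to 0), and observations in the overlap leave genuine uncertainty that the control limit must handle.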