Indices for Families of Competing Markov Decision Processes with Influence

Citation
Glazebrook, K. D., Indices for Families of Competing Markov Decision Processes with Influence, Annals of Applied Probability, 3(4), 1993, pp. 1013-1032
ISSN journal
1050-5164
Volume
3
Issue
4
Year of publication
1993
Pages
1013 - 1032
Database
ACNP
Abstract
Nash obtained an important extension to the classical theory of Gittins indexation when he demonstrated that index policies were optimal for a class of multiarmed bandit problems with a multiplicatively separable reward structure. We characterise the relevant indices (herein referred to as Nash indices) as equivalent retirement rewards/penalties for appropriately defined maximisation/minimisation problems. We also give a condition which is sufficient to guarantee the optimality of index policies for a Nash-type model in which each constituent bandit has its own decision structure.