Jh. Crosbie et Kd. Glazebrook, Index policies and a novel performance space structure for a class of generalised branching bandit problems, MATH OPER R, 25(2), 2000, pp. 281-297
Our object of study is a new class of controlled stochastic systems called
generalised branching bandits which include discounted branching bandits an
d generalised bandit problems as special cases. These models allow us to st
udy queueing scheduling and project scheduling problems in which the reward
earned from processing a particular job is influenced by other job types p
resent in the system. Our mode of analysis is the achievable region approac
h. What is novel here is that the full system may be partitioned into two s
ubsystems, each of which satisfies its own set of conservation laws and has
as own highly structured performance space. An optimal policy for the full
system is constructed from two sets of Gittins indices derived from the co
nservation laws governing the two subsystems.