Cd. Fuh et Ic. Hu, Asymptotically efficient strategies for a stochastic scheduling problem with order constraints, ANN STATIST, 28(6), 2000, pp. 1670-1695
Motivated by an application in computerized adaptive tests, we consider the
following sequential design problem. There are J jobs to be processed acco
rding to a predetermined order. A single machine is available to process th
ese J jobs. Each job under processing evolves stochastically as a Markov ch
ain and earns rewards as it is processed, not otherwise. The Markov chain h
as transition probabilities parameterized by an unknown parameter theta. Th
e objective is to determine how long each job should be processed so that t
he total expected rewards over an extended time interval is maximized. We d
efine the regret associated with a strategy as the shortfall from the maxim
um expected reward under complete information on theta. Therefore the probl
em is equivalent to minimizing the regret. The asymptotic lower bound for t
he regret associated with any uniformly good strategy is characterized by a
deterministic constraint minimization problem. In ignorance of the paramet
er value, we construct a class of efficient strategies, which achieve the l
ower bound, based on the theory of sequential testing.