ITA
ENG

Index-based policies for discounted multi-armed bandits on parallel machines

Authors

Glazebrook, KD Wilkinson, DJ

Citation

Kd. Glazebrook et Dj. Wilkinson, Index-based policies for discounted multi-armed bandits on parallel machines, ANN APPL PR, 10(3), 2000, pp. 877-896

Citations number

Categorie Soggetti

Mathematics

Journal title

ANNALS OF APPLIED PROBABILITY

ISSN journal

10505164 → ACNP

Volume

Issue

Year of publication

2000

Pages

877 - 896

Database

ISI

SICI code

1050-5164(200008)10:3<877:IPFDMB>2.0.ZU;2-A

Abstract

We utilize and develop elements of the recent achievable region account of Gittins indexation by Bertsimas and Nino-Mora to design index-based policie s for discounted multi-armed bandits on parallel machines. The policies ana lyzed have expected rewards which come within an O(alpha) quantity of optim ality, where alpha > 0 is a discount rate. In the main, the policies make a n initial once for all allocation of bandits to machines, with each machine then handling its own workload optimally. This allocation must take carefu l account of the index structure of the bandits. The corresponding limit po licies are average-overtaking optimal.