ITA
ENG

OPTIMALITY OF MONOTONIC POLICIES FOR 2-ACTION MARKOVIAN DECISION-PROCESSES, WITH APPLICATIONS TO CONTROL OF QUEUES WITH DELAYED INFORMATION

Authors

ALTMAN E STIDHAM S

Citation

E. Altman et S. Stidham, OPTIMALITY OF MONOTONIC POLICIES FOR 2-ACTION MARKOVIAN DECISION-PROCESSES, WITH APPLICATIONS TO CONTROL OF QUEUES WITH DELAYED INFORMATION, Queuing systems, 21(3-4), 1995, pp. 267-291

Citations number

Categorie Soggetti

Operatione Research & Management Science","Computer Science Interdisciplinary Applications

Journal title

Queuing systems → ACNP

ISSN journal

02570130

Volume

Issue

3-4

Year of publication

1995

Pages

267 - 291

Database

ISI

SICI code

0257-0130(1995)21:3-4<267:OOMPF2>2.0.ZU;2-F

Abstract

We consider a discrete-time Markov decision process with a partially o rdered state space and two feasible control actions in each state. Our goal is to find general conditions, which are satisfied in a broad cl ass of applications to control of queues, under which an optimal contr ol policy is monotonic. An advantage of our approach is that it easily extends to problems with both information and action delays, which ar e common in applications to high-speed communication networks, among o thers. The transition probabilities are stochastically monotone and th e one-stage reward submodular. We further assume that transitions from different states are coupled, in the sense that the state after a tra nsition is distributed as a deterministic function of the current stat e and two random variables, one of which is controllable and the other uncontrollable. Finally, we make a monotonicity assumption about the sample-path effect of a pairwise switch of the actions in consecutive stages. Using induction on the horizon length, we demonstrate that opt imal policies for the finite- and infinite-horizon discounted problems are monotonic. We apply these results to a single queueing facility w ith control of arrivals and/or services, under very general conditions . In this case, our results imply that an optimal control policy has t hreshold form. Finally, we show how monotonicity of an optimal policy extends in a natural way to problems with information and/or action de lay, including delays of more than one time unit. Specifically, we sho w that, if a problem without delay satisfies our sufficient conditions for monotonicity of an optimal policy, then the same problem with inf ormation and/or action delay also has monotonic (e.g., threshold) opti mal policies.