Dth. Ng et al., ADAPTIVE LEARNING-MECHANISMS FOR ORDERING ACTIONS USING RANDOM RACES, IEEE transactions on systems, man, and cybernetics, 23(5), 1993, pp. 1450-1465
Citations number
23
Categorie Soggetti
Controlo Theory & Cybernetics","Computer Science Cybernetics","Engineering, Eletrical & Electronic
Consider a learning machine (LM) interacting with an environment epsil
on. The environment offers the machine M actions. Traditionally, learn
ing systems endeavor to compute the best action that the environment o
ffers, and this is done without any estimation procedure. In this pape
r, we consider the problem of the LM computing not only the optimal ac
tion offered but also the ordering of the actions in terms of their op
timality. The problem is posed in its generality and various norms of
learning in this setting are formalized. Also various learning strateg
ies are presented that use a new mathematical model called the random
race. In this model the learning is modeled using M racers that are ru
nning toward a goal. At each instant, racer R(i) moves toward the goal
with a probability of s(i) and stays where he is with a probability o
f (1 - s(i)). In the simplest learning model, the learning multiple ra
ce track (LMRT) model, the racers run on multiple tracks, and in this
scenario, each racer has his own track, thus disallowing interference
between the racers. However, in a more general setting, the learning s
ingle race track (LSRT) model, the racers run on a single track, and i
n this case, interferences between racers are specified in terms of ov
ertaking rules. In this paper, we first examine the learning multiple
race track (LMRT) model, and we have shown that in the absence of a pr
iori information the LMRT is permutationally epsilon-optimal in all su
ggestive random environments. Subsequently, a variety of results are p
roven for the cases when the LMRT uses the a priori information by giv
ing the racers handicaps that are either uniformly or geometrically di
stributed. Analogous results for the learning single race track (LSRT)
model are also conjectured and these conjectures are strengthened by
numerous simulations.