An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing an algorithm that performs an on-line search through the space of action mappings, expressed as Boolean formulae. The algorithm is compared with existing methods in empirical trials and is shown to have very good performance.
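To make the representational idea concrete, the following is a minimal sketch, not the paper's algorithm: it illustrates one way an action mapping over Boolean state features could be expressed as a DNF formula and pruned by a mistake-driven on-line update. The class names, the two-action setting, the term-elimination rule, and the toy environment are all assumptions introduced here for illustration.

```python
# Illustrative sketch only (assumed details, not the paper's method): a policy
# over Boolean state features represented as a DNF formula, with an on-line,
# mistake-driven elimination update.

from itertools import combinations

def make_terms(n_features, max_literals=2):
    """Enumerate candidate conjunctive terms over n Boolean features.

    A term is a tuple of (feature_index, required_value) literals.
    """
    literals = [(i, v) for i in range(n_features) for v in (0, 1)]
    terms = []
    for k in range(1, max_literals + 1):
        for combo in combinations(literals, k):
            # Skip contradictory terms that test the same feature twice.
            if len({i for i, _ in combo}) == k:
                terms.append(combo)
    return terms

def term_fires(term, state):
    """A conjunctive term fires when every literal matches the state."""
    return all(state[i] == v for i, v in term)

class DNFPolicy:
    """Policy: take action 1 if any surviving term fires, else action 0."""

    def __init__(self, n_features, max_literals=2):
        self.terms = make_terms(n_features, max_literals)

    def act(self, state):
        return 1 if any(term_fires(t, state) for t in self.terms) else 0

    def update(self, state, action, reward):
        """On-line elimination step (an assumed update rule): when action 1
        was taken and punished, drop every term that recommended it."""
        if action == 1 and reward < 0:
            self.terms = [t for t in self.terms if not term_fires(t, state)]

# Tiny usage example on a hypothetical two-feature environment.
if __name__ == "__main__":
    policy = DNFPolicy(n_features=2)
    trials = [((1, 0), -1), ((0, 1), +1)]  # (state, reward if action 1 is taken)
    for state, reward_if_act in trials:
        a = policy.act(state)
        r = reward_if_act if a == 1 else 0
        policy.update(state, a, r)
    print(policy.act((1, 0)), policy.act((0, 1)))  # prints: 0 1 after elimination
```

The design choice this sketch highlights is the one the abstract points to: restricting policies to a syntactic class (here, bounded-size DNF) keeps the hypothesis space small enough that an on-line search over it is feasible and can generalize across states that satisfy the same terms.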