Individuals in a finite population repeatedly choose among actions yielding uncertain payoffs. Between choices, each individual observes the action and realized outcome of one other individual. We restrict our search to learning rules with limited memory that increase expected payoffs regardless of the distribution underlying their realizations. It is shown that the rule that outperforms all others is the one that imitates the action of an observed individual (whose realized outcome is better than one's own) with a probability proportional to the difference in these realizations. When each individual uses this best rule, the aggregate population behavior is approximated by the replicator dynamic. © 1998 Academic Press.
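A minimal formal statement of the rule (the notation, and the assumption that payoffs are known to lie in an interval $[\alpha,\omega]$, are mine rather than quoted from the paper): an individual who played action $i$ and realized payoff $\pi_i$, after observing another individual who played $j$ and realized $\pi_j$, switches to $j$ with probability

\[
P(i \to j) \;=\; \max\!\left\{0,\; \frac{\pi_j - \pi_i}{\omega - \alpha}\right\}.
\]

When every individual follows this rule, the expected change in the population share $x_j$ of action $j$ takes the replicator form $\Delta x_j \propto x_j\,(\bar{\pi}_j - \bar{\pi})$, where $\bar{\pi}_j$ is the expected payoff to action $j$ and $\bar{\pi}$ is the population-average expected payoff.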
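As an illustration only (a hypothetical simulation sketch, with payoff distributions, population size, and all other parameters chosen by me and not taken from the paper), the following Python script lets a finite population play a two-armed bandit under the proportional imitation rule; the share of the better action should drift upward, in line with the replicator-dynamic approximation.

```python
import random

# Hypothetical sketch of the proportional imitation rule on a two-armed
# bandit. All parameters below are illustrative assumptions.

PAYOFF_MIN, PAYOFF_MAX = 0.0, 1.0      # assumed known payoff bounds
ACTION_MEANS = {"A": 0.6, "B": 0.4}    # Bernoulli payoff probabilities

def realized_payoff(action):
    """Draw a payoff in [PAYOFF_MIN, PAYOFF_MAX] for the chosen action."""
    return 1.0 if random.random() < ACTION_MEANS[action] else 0.0

def step(population):
    """One round: everyone plays, observes one random other individual,
    and imitates that individual's action with probability proportional
    to the (positive) payoff difference."""
    payoffs = [realized_payoff(a) for a in population]
    new_population = list(population)
    n = len(population)
    for i in range(n):
        j = random.choice([k for k in range(n) if k != i])  # observed individual
        diff = payoffs[j] - payoffs[i]
        if diff > 0 and random.random() < diff / (PAYOFF_MAX - PAYOFF_MIN):
            new_population[i] = population[j]
    return new_population

if __name__ == "__main__":
    random.seed(0)
    pop = ["A"] * 20 + ["B"] * 80          # start with the better action rare
    for t in range(200):
        pop = step(pop)
        if t % 50 == 0:
            print(f"round {t:3d}: share of A = {pop.count('A') / len(pop):.2f}")
    print(f"final share of A = {pop.count('A') / len(pop):.2f}")
```

In this sketch the expected per-round motion of the share of A is proportional to x_A (1 - x_A) times the difference in mean payoffs, which is the two-action replicator dynamic in discrete time.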