Jy. Wakano and N. Yamamura, A simple learning strategy that realizes robust cooperation better than Pavlov in Iterated Prisoners' Dilemma, J ETHOL, 19(1), 2001, pp. 1-8
Pavlov was proposed as a leading strategy for realizing cooperation because
it dominates over a long period in evolutionary computer simulations of the
Iterated Prisoners' Dilemma. However, our numerical calculations reveal that
Pavlov and also any other cooperative strategy are not evolutionarily stable
among all stochastic strategies with memory of only one previous move. We
propose simple learning based on reinforcement. The learner changes its
internal state, depending on an evaluation of whether the score in the
previous round is larger than a critical value (aspiration level), which is
genetically fixed. The current internal state decides the learner's move, but
we found that the aspiration level determines its final behavior. The
cooperative variant, having an intermediate aspiration level, is not an
evolutionarily stable strategy (ESS) when evaluation is binary (good or bad).
However, when the evaluation is quantified, some cooperative variants can
invade not only All-C, Tit-For-Tat (TFT), and Pavlov but also noncooperative
variants with different aspiration levels. Moreover, they establish robust
cooperation, which is evolutionarily stable against invasion by All-C, All-D,
TFT, Pavlov, and noncooperative variants, and they receive a high score even
when the error rate is high. Our results suggest that mutual cooperation can
be maintained when players have a primitive learning ability.
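
The learning rule described in the abstract can be illustrated with a minimal
Python sketch. The abstract does not give the update rule or the payoff
values, so everything concrete here is an assumption: the payoffs (5, 3, 1, 0),
the class name AspirationLearner, the scalar internal state whose sign selects
the move, and the rule that a satisfying score reinforces the last move while
an unsatisfying one weakens it. Implementation errors are not modeled.

```python
# Assumed standard Prisoner's Dilemma payoffs (not stated in the abstract):
# PAYOFF[(my_move, opponent_move)] is my score for the round.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class AspirationLearner:
    """Hypothetical reinforcement learner with a fixed aspiration level."""

    def __init__(self, aspiration, quantified=False, state=0.0):
        self.aspiration = aspiration  # fixed for the learner's lifetime
        self.quantified = quantified  # False: binary good/bad evaluation
        self.state = state            # assumed: state >= 0 plays C, else D
        self.last_move = None

    def move(self):
        self.last_move = "C" if self.state >= 0 else "D"
        return self.last_move

    def learn(self, score):
        diff = score - self.aspiration
        if self.quantified:
            evaluation = diff                      # size of the surprise
        else:
            evaluation = 1.0 if diff > 0 else -1.0  # good or bad only
        # Reinforce the last move if the evaluation is positive,
        # weaken it if the evaluation is negative.
        direction = 1.0 if self.last_move == "C" else -1.0
        self.state += direction * evaluation


def play(a, b, rounds):
    """Play an iterated game and return the total scores (a, b)."""
    total_a = total_b = 0
    for _ in range(rounds):
        ma, mb = a.move(), b.move()
        sa, sb = PAYOFF[(ma, mb)], PAYOFF[(mb, ma)]
        a.learn(sa)
        b.learn(sb)
        total_a += sa
        total_b += sb
    return total_a, total_b
```

Under these assumptions, two learners with an intermediate aspiration level
(here 2, between the mutual-defection and mutual-cooperation payoffs) that
start at state 0 keep cooperating, because each round's score of 3 exceeds the
aspiration level and reinforces cooperation. The sketch says nothing about
evolutionary stability, which the paper studies by other means.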