A simple learning strategy that realizes robust cooperation better than Pavlov in Iterated Prisoners' Dilemma

Citation
J.Y. Wakano and N. Yamamura, A simple learning strategy that realizes robust cooperation better than Pavlov in Iterated Prisoners' Dilemma, J ETHOL, 19(1), 2001, pp. 1-8
Number of citations
27
Subject Categories
Animal Sciences
Journal title
JOURNAL OF ETHOLOGY
ISSN journal
0289-0771
Volume
19
Issue
1
Year of publication
2001
Pages
1 - 8
Database
ISI
SICI code
0289-0771(2001)19:1<1:ASLSTR>2.0.ZU;2-0
Abstract
Pavlov was proposed as a leading strategy for realizing cooperation because it dominates over a long period in evolutionary computer simulations of the Iterated Prisoners' Dilemma. However, our numerical calculations reveal that Pavlov and also any other cooperative strategy are not evolutionarily stable among all stochastic strategies with memory of only one previous move. We propose simple learning based on reinforcement. The learner changes its internal state, depending on an evaluation of whether the score in the previous round is larger than a critical value (aspiration level), which is genetically fixed. The current internal state decides the learner's move, but we found that the aspiration level determines its final behavior. The cooperative variant, having an intermediate aspiration level, is not an evolutionarily stable strategy (ESS) when evaluation is binary (good or bad). However, when the evaluation is quantified, some cooperative variants can invade not only All-C, Tit-For-Tat (TFT), and Pavlov but also noncooperative variants with different aspiration levels. Moreover, they establish robust cooperation, which is evolutionarily stable against invasion by All-C, All-D, TFT, Pavlov, and noncooperative variants, and they receive a high score even when the error rate is high. Our results suggest that mutual cooperation can be maintained when players have a primitive learning ability.
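
The aspiration-level learner described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual update rule, so everything concrete here is an assumption made for illustration: the payoff values (5, 3, 1, 0), the class names `BinaryLearner` and `QuantifiedLearner`, the `rate` parameter, and the specific form of the quantified update. With an aspiration level between the mutual-defection and mutual-cooperation payoffs, the binary variant as written behaves like Pavlov (win-stay, lose-shift).

```python
# Assumed payoff table (hypothetical values): (my move, opponent move) -> my score.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class BinaryLearner:
    """Binary evaluation: the previous score is judged only as good or bad."""

    def __init__(self, aspiration, state="C"):
        self.aspiration = aspiration  # fixed critical value
        self.state = state            # internal state doubles as the next move

    def move(self):
        return self.state

    def update(self, score):
        # "Bad" outcome (score not above the aspiration level): switch state.
        if score <= self.aspiration:
            self.state = "D" if self.state == "C" else "C"


class QuantifiedLearner:
    """Quantified evaluation (assumed form): the state is a real number
    shifted in proportion to how far the score is from the aspiration."""

    def __init__(self, aspiration, state=1.0, rate=1.0):
        self.aspiration = aspiration
        self.state = state  # state >= 0 means cooperate, otherwise defect
        self.rate = rate    # hypothetical learning-rate parameter

    def move(self):
        return "C" if self.state >= 0 else "D"

    def update(self, score):
        # Reinforce the move just played when the score beats the aspiration,
        # weaken it otherwise; the size of the shift scales with the gap.
        direction = 1.0 if self.move() == "C" else -1.0
        self.state += self.rate * direction * (score - self.aspiration)


def play(a, b, rounds):
    """Play the iterated game without errors and return the move history."""
    history = []
    for _ in range(rounds):
        move_a, move_b = a.move(), b.move()
        a.update(PAYOFF[(move_a, move_b)])
        b.update(PAYOFF[(move_b, move_a)])
        history.append((move_a, move_b))
    return history


if __name__ == "__main__":
    # One learner starts out defecting, as if a single error had occurred.
    print(play(BinaryLearner(2, "C"), BinaryLearner(2, "D"), 4))
```

In this sketch the two binary learners pass through one round of mutual defection and then settle back into mutual cooperation. The sketch does not attempt to reproduce the evolutionary-stability results reported in the abstract, which depend on the paper's own model.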