COEVOLUTION IN THE SUCCESSFUL LEARNING OF BACKGAMMON STRATEGY

Citation
J. B. Pollack and A. D. Blair, Coevolution in the successful learning of backgammon strategy, Machine Learning, 32(3), 1998, pp. 225-240
Number of citations
28
Subject Categories
Computer Science, Artificial Intelligence
Journal title
Machine Learning
Journal ISSN
0885-6125
Volume
32
Issue
3
Year of publication
1998
Pages
225 - 240
Database
ISI
SICI code
0885-6125(1998)32:3<225:CITSLO>2.0.ZU;2-Z
Abstract
Following Tesauro's work on TD-Gammon, we used a 4,000-parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a "meta-game" of self-learning.
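
The champion-versus-challenger loop described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names choose_position, mutate, hill_climb, TARGET, distance and toy_challenger_wins are hypothetical, the mutation is assumed to be Gaussian noise, a winning challenger is assumed to replace the champion outright (the abstract says only that the weights change), and a toy noisy contest stands in for a real backgammon game.

```python
import random


def choose_position(evaluate, candidates):
    """Greedy play: pick the candidate position with the highest evaluation."""
    return max(candidates, key=evaluate)


def mutate(weights, sigma, rng):
    """Return a copy of the weights with Gaussian noise added (assumed mutation)."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def hill_climb(challenger_wins, n_weights, generations, sigma, rng):
    """Champion-vs-challenger hill climbing, starting from all-zero weights."""
    champion = [0.0] * n_weights
    for _ in range(generations):
        challenger = mutate(champion, sigma, rng)
        # Relative fitness: the only feedback is who won the contest.
        if challenger_wins(champion, challenger, rng):
            champion = challenger  # assumption: outright replacement
    return champion


# Hypothetical stand-in for backgammon: a hidden "ideal" weight vector.
TARGET = [1.0, -2.0, 0.5]


def distance(weights):
    """Squared distance from the hidden target (used by the toy contest only)."""
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def toy_challenger_wins(champion, challenger, rng):
    """Noisy contest: the player nearer TARGET wins 80% of the time."""
    challenger_better = distance(challenger) < distance(champion)
    return rng.random() < (0.8 if challenger_better else 0.2)


if __name__ == "__main__":
    rng = random.Random(0)
    final = hill_climb(toy_challenger_wins, 3, 2000, 0.1, rng)
    print("initial distance:", distance([0.0] * 3))
    print("final distance:  ", distance(final))
```

In a real setting, challenger_wins would play one or more backgammon games between the two networks, with each side using something like choose_position over the legal moves for its dice roll. The toy contest is noisy on purpose, so a weaker challenger sometimes wins, loosely mimicking the role of the dice.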