A BAYESIAN-APPROACH TO RELEVANCE IN GAME PLAYING

Authors
Citation
Eb. Baum et Wd. Smith, A BAYESIAN-APPROACH TO RELEVANCE IN GAME PLAYING, Artificial intelligence, 97(1-2), 1997, pp. 195-242
Citations number
48
Journal title
ISSN journal
00043702
Volume
97
Issue
1-2
Year of publication
1997
Pages
195 - 242
Database
ISI
SICI code
0004-3702(1997)97:1-2<195:ABTRIG>2.0.ZU;2-O
Abstract
The point of game tree search is to insulate oneself from errors in th e evaluation function. The standard approach is to grow a full width t ree as deep as time allows, and then value the tree as if the leaf eva luations were exact. The alpha-beta algorithm implements this with gre at computational efficiency. This approach has been effective in many games. Our approach is to form a Bayesian model of our uncertainty. We adopt an evaluation function that returns a probability distribution estimating the probability of various errors in valuing each position. These estimates are obtained by training from data. We thus use addit ional information at each leaf not available to the standard approach. We utilize this information in three ways: to evaluate which move is best after we are done expanding, to allocate additional thinking time to moves where additional time is most relevant to game outcome, and, perhaps most importantly, to expand the tree along the most relevant lines. Our measure of the relevance of expanding a given leaf provably approximates a measure of the impact of expanding the leaf on expecte d payoff; including the impact of the outcome of the leaf expansion on later expansion decisions. Our algorithms run (under reasonable assum ptions) in time linear in the size of the final tree and hence except for a small constant factor, are as time efficient as alpha-beta. Our algorithm focuses on relevant lines, on which it can in principle grow a tree several times as deep as alpha-beta in a given amount of time. We have tested our approach on a variety of games, including Othello, Kalah, Warri, and others. Our probability independence approximations are seen to be significantly violated, but nonetheless our tree valua tion scheme was found to play significantly better than minimax or the Probability Product rule when both competitors search the same tree. Our full search algorithm was found to outplay a highly ranked, direct ly comparable alpha-beta Othello program even when the alpha-beta prog ram was given sizeable time odds, and also performed well against the three top Othello programs on the Internet Othello Server. (C) 1997 El sevier Science B.V.