DISCOUNTED REINFORCEMENT LEARNING DOES NOT SCALE

Citation
M. A. McDonald and P. Hingston, DISCOUNTED REINFORCEMENT LEARNING DOES NOT SCALE, Computational Intelligence, 13(1), 1997, pp. 126-143
Citations number
41
Subject categories
Computer Sciences, Special Topics; Computer Science, Artificial Intelligence
Journal title
Computational Intelligence
ISSN journal
0824-7935
Volume
13
Issue
1
Year of publication
1997
Pages
126 - 143
Database
ISI
SICI code
0824-7935(1997)13:1<126:DRLDNS>2.0.ZU;2-E
Abstract
Currently popular reinforcement learning methods are based on estimating value functions that indicate the long-term value of each problem state. In many domains, such as those traditionally studied in AI planning research, the size of state spaces precludes the individual storage of state value estimates. Consequently, most practical implementations of reinforcement learning methods have stored value functions using generalizing function approximators, with mixed results. We analyze the effects of approximation error on the performance of goal-based tasks, revealing potentially severe scaling difficulties. Empirical evidence is presented that suggests when difficulties are likely to occur and explains some of the widely differing results reported in the literature.
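The abstract contrasts storing one value estimate per state with storing a value function in a generalizing function approximator. A minimal sketch of both schemes (not code from the paper; the chain task, features, and learning parameters are all illustrative assumptions) using TD(0) on a small goal-based chain:

```python
import random

random.seed(0)

N_STATES = 10   # states 0..9; state 9 is the goal (illustrative toy task)
GAMMA = 0.9     # discount factor
ALPHA = 0.1     # learning rate

def step(s):
    """Advance one step: move toward the goal or stay; reward 1 at the goal."""
    s2 = min(s + random.choice([0, 1]), N_STATES - 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

# Tabular TD(0): one stored estimate per state, which is exactly what
# large state spaces preclude.
V = [0.0] * N_STATES
for _ in range(5000):
    s = 0
    while s < N_STATES - 1:
        s2, r = step(s)
        target = r if s2 == N_STATES - 1 else r + GAMMA * V[s2]
        V[s] += ALPHA * (target - V[s])
        s = s2

# Generalizing approximator: value = w . phi(s); storage is len(w) weights,
# independent of the number of states, at the cost of approximation error.
def phi(s):
    x = s / (N_STATES - 1)
    return [1.0, x, x * x]  # tiny polynomial feature vector (an assumption)

w = [0.0, 0.0, 0.0]

def v_hat(s):
    return sum(wi * fi for wi, fi in zip(w, phi(s)))

for _ in range(5000):
    s = 0
    while s < N_STATES - 1:
        s2, r = step(s)
        target = r if s2 == N_STATES - 1 else r + GAMMA * v_hat(s2)
        delta = target - v_hat(s)
        for i, fi in enumerate(phi(s)):
            w[i] += ALPHA * delta * fi
        s = s2
```

The approximate values near the goal will generally not match the tabular ones exactly; that residual approximation error, compounded by discounting over long goal-directed trajectories, is the kind of effect the paper analyzes.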