TEMPORAL DIFFERENCE-METHODS AND MARKOV-MODELS

Authors
Citation
E. Barnard, TEMPORAL DIFFERENCE-METHODS AND MARKOV-MODELS, IEEE Transactions on Systems, Man, and Cybernetics, 23(2), 1993, pp. 357-365
Number of citations
14
Subject Categories
Control Theory & Cybernetics; Computer Applications & Cybernetics
ISSN journal
00189472
Volume
23
Issue
2
Year of publication
1993
Pages
357 - 365
Database
ISI
SICI code
0018-9472(1993)23:2<357:TDAM>2.0.ZU;2-J
Abstract
The relation between temporal-difference training methods and Markov models, which was first noticed by Sutton, is explored. This relation is derived from a new perspective, and in this way the particular association between conventional temporal-difference methods and first-order Markov models is explained. We then derive a generalization of temporal-difference methods that is suitable for Markov models of higher order. Finally, several issues related to the performance of mismatched temporal-difference methods (i.e., the performance when the temporal-difference method is not specifically designed to match the order of the Markov model) are investigated numerically.