TEMPORAL DIFFERENCE-METHODS AND MARKOV-MODELS

Authors

BARNARD E

Citation

E. Barnard, TEMPORAL DIFFERENCE-METHODS AND MARKOV-MODELS, IEEE transactions on systems, man, and cybernetics, 23(2), 1993, pp. 357-365

Citations number

Subject categories

Control Theory & Cybernetics; Computer Applications & Cybernetics

Journal title

IEEE transactions on systems, man, and cybernetics

ISSN journal

0018-9472

Volume

23

Issue

2

Year of publication

1993

Pages

357 - 365

Database

ISI

SICI code

0018-9472(1993)23:2<357:TDAM>2.0.ZU;2-J

Abstract

The relation between temporal-difference training methods and Markov models, which was first noticed by Sutton, is explored. This relation is derived from a new perspective, and in this way the particular association between conventional temporal-difference methods and first-order Markov models is explained. We then derive a generalization of temporal-difference methods that is suitable for Markov models of higher order. Finally, several issues related to the performance of mismatched temporal-difference methods (i.e., the performance when the temporal-difference method is not specifically designed to match the order of the Markov model) are investigated numerically.