A model of hippocampally dependent navigation, using the temporal difference learning rule

Citation
Dj. Foster et al., A model of hippocampally dependent navigation, using the temporal difference learning rule, HIPPOCAMPUS, 10(1), 2000, pp. 1-16
Number of citations
56
Subject Categories
Neurosciences & Behavior
Journal title
HIPPOCAMPUS
ISSN journal
10509631
Volume
10
Issue
1
Year of publication
2000
Pages
1 - 16
Database
ISI
SICI code
1050-9631(2000)10:1<1:AMOHDN>2.0.ZU;2-Z
Abstract
This paper presents a model of how hippocampal place cells might be used for spatial navigation in two watermaze tasks: the standard reference memory task and a delayed matching-to-place task. In the reference memory task, the escape platform occupies a single location and rats gradually learn relatively direct paths to the goal over the course of days, in each of which they perform a fixed number of trials. In the delayed matching-to-place task, the escape platform occupies a novel location on each day, and rats gradually acquire one-trial learning, i.e., direct paths on the second trial of each day. The model uses a local, incremental, and statistically efficient connectionist algorithm called temporal difference learning in two distinct components. The first is a reinforcement-based "actor-critic" network that is a general model of classical and instrumental conditioning. In this case, it is applied to navigation, using place cells to provide information about state. By itself, the actor-critic can learn the reference memory task, but this learning is inflexible to changes to the platform location. We argue that one-trial learning in the delayed matching-to-place task demands a goal-independent representation of space. This is provided by the second component of the model: a network that uses temporal difference learning and self-motion information to acquire consistent spatial coordinates in the environment. Each component of the model is necessary at a different stage of the task; the actor-critic provides a way of transferring control to the component that performs best. The model successfully captures gradual acquisition in both tasks, and, in particular, the ultimate development of one-trial learning in the delayed matching-to-place task. Place cells report a form of stable, allocentric information that is well-suited to the various kinds of learning in the model. Hippocampus 2000;10:1-16. (C) 2000 Wiley-Liss, Inc.
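As a rough illustration of the first component described in the abstract, the sketch below shows one temporal difference update of an actor-critic that reads its state from place-cell activity. It is not the authors' implementation: the Gaussian tuning curve, the number of actions, the softmax action rule, and all parameter values (`sigma`, `gamma`, `lr_critic`, `lr_actor`) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def place_cell_activity(pos, centers, sigma=0.16):
    """Firing rates of place cells at 2-D position `pos`.
    Gaussian tuning around each center is an assumed form."""
    d2 = np.sum((centers - pos) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def action_probabilities(z, f):
    """Softmax over action preferences z @ f (assumed action rule)."""
    prefs = z @ f
    prefs = prefs - prefs.max()  # subtract max for numerical stability
    e = np.exp(prefs)
    return e / e.sum()

def actor_critic_step(w, z, f, f_next, action, reward, terminal,
                      gamma=0.98, lr_critic=0.1, lr_actor=0.1):
    """One TD(0) update. `w` (critic weights) and `z` (actor weights,
    one row per action) are modified in place; returns the TD error."""
    v = w @ f                                  # value estimate of current state
    v_next = 0.0 if terminal else w @ f_next   # no future value at the platform
    delta = reward + gamma * v_next - v        # temporal difference error
    w += lr_critic * delta * f                 # critic: improve the value estimate
    z[action] += lr_actor * delta * f          # actor: reinforce the action taken
    return delta
```

Both weight updates are driven by the same error signal `delta` and use only the activity of the currently active place cells, which is the sense in which such a rule is local and incremental. Because the learned values and action preferences are tied to a specific reward location in this sketch, moving the platform would require relearning, consistent with the inflexibility the abstract attributes to the actor-critic on its own.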