An architecture for learning "potential field" cognitive maps with an application to mobile robotics

Authors
Citation
A.G. Pipe, An architecture for learning "potential field" cognitive maps with an application to mobile robotics, ADAPT BEHAV, 8(2), 2001, pp. 173-203
Number of citations
59
Subject categories
Psychology
Journal title
ADAPTIVE BEHAVIOR
Journal ISSN
1059-7123
Volume
8
Issue
2
Year of publication
2001
Pages
173 - 203
Database
ISI
SICI code
1059-7123(200121)8:2<173:AAFL"F>2.0.ZU;2-T
Abstract
The learning architecture described in this article autonomously acquires a topographical (metric) map that encodes a measure of "value" for xy-Cartesian locations in an environment. There are two reasons for the creation of low value areas. Direct negative reinforcement from the environment will result from the robot discovering obstacles or having other "unpleasant" experiences. The other source of negative reinforcement is internally generated by the learning algorithm, as it identifies regions that are a long distance away from the "pleasant" places in the environment. Conversely, example "pleasant" places, where positive environmental reward is received, might be energy-charging sites or simply locations that the robot should visit in executing its daily tasks. In general what the robot learns is a map of "motivational" tendencies, or "expectancies". In such a map, the value attached to a place comes to reflect a balance between the good and bad rewards attainable from that position. When the Temporal Difference learning part of the architecture is turned on, that measure of value comes to include an estimate of how far, in travel time, it is to positive reinforcement. The architecture is loosely based on an Adaptive Heuristic Critic structure. Exploration of a continuous-valued search space is conducted by an Evolution Strategy, tuned for fast and approximate optimization. Knowledge acquired autonomously from this exploration is stored in a Radial Basis Function (RBF) neural network. Inherent features of this neural network type lead to the creation of a "potential field" structure that exerts appetitive and aversive "forces" on the robot as it moves around in the environment. The results of simulation experiments are presented, with a view to illustrating the strengths and weaknesses of the architecture. The map building architecture proposed here is intended to form part of an overall navigational system. In future work it will be integrated with a self-localization algorithm, landmark-based topological mapping, and a reactive system for dealing with local dynamics in the environment.
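
The abstract's idea of a value map stored in an RBF network and shaped by Temporal Difference learning can be illustrated with a minimal sketch. This is not the article's implementation: the class name RBFValueMap, the grid of centers, the Gaussian width, the learning rate, the discount factor, and the straight-line path are all assumptions chosen for illustration, and the Evolution Strategy exploration and the rest of the Adaptive Heuristic Critic structure are omitted.

```python
import math


class RBFValueMap:
    """Illustrative Gaussian RBF value map over xy locations (hypothetical names)."""

    def __init__(self, centers, width=0.5):
        self.centers = list(centers)           # assumed fixed RBF centers
        self.width = width                     # assumed shared Gaussian width
        self.weights = [0.0] * len(self.centers)

    def features(self, x, y):
        # Gaussian activation of every basis function at (x, y).
        two_w2 = 2.0 * self.width ** 2
        return [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / two_w2)
                for cx, cy in self.centers]

    def value(self, x, y):
        # The map's "value" at (x, y) is a weighted sum of activations.
        return sum(w * f for w, f in zip(self.weights, self.features(x, y)))

    def td_update(self, pos, reward, next_pos, alpha=0.1, gamma=0.9,
                  terminal=False):
        # Standard TD(0) rule for a linear approximator:
        # delta = r + gamma * V(s') - V(s); w += alpha * delta * phi(s).
        target = reward
        if not terminal:
            target += gamma * self.value(*next_pos)
        delta = target - self.value(*pos)
        for i, f in enumerate(self.features(*pos)):
            self.weights[i] += alpha * delta * f
        return delta


def demo(sweeps=1000):
    # Assumed 5x5 grid of centers; a "pleasant" place sits at (4, 0).
    centers = [(float(i), float(j)) for i in range(5) for j in range(5)]
    vmap = RBFValueMap(centers)
    path = [(float(i), 0.0) for i in range(5)]  # walk along y = 0 to the goal
    last = len(path) - 1
    for _ in range(sweeps):
        for k in range(last):
            at_goal = (k + 1 == last)           # reward only on reaching goal
            vmap.td_update(path[k], 1.0 if at_goal else 0.0, path[k + 1],
                           terminal=at_goal)
    return vmap
```

Under these assumptions, calling demo() and then comparing value(3.0, 0.0) with value(0.0, 0.0) should show higher value closer to the rewarded location, which is the sense in which learned value can come to reflect travel time to positive reinforcement. The smooth overlap of the Gaussian basis functions is what gives the map a field-like gradient between sampled locations.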