ITA
ENG

Interleaved versus a priori exploration for repeated navigation in a partially-known graph

Authors

Argamon-Engelson, S Kraus, S Sina, S

Citation

S. Argamon-engelson et al., Interleaved versus a priori exploration for repeated navigation in a partially-known graph, INT J PATT, 13(7), 1999, pp. 963-986

Citations number

Categorie Soggetti

AI Robotics and Automatic Control

Journal title

INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE

ISSN journal

02180014 → ACNP

Volume

Issue

Year of publication

1999

Pages

963 - 986

Database

ISI

SICI code

0218-0014(199911)13:7<963:IVAPEF>2.0.ZU;2-V

Abstract

In this paper, we address the tradeoff between exploration and exploitation for agents which need to learn more about the structure of their environme nt in order to perform more effectively. For example, a software agent oper ating on the World Wide Web may need to learn which sites on the net are mo st useful, and the most efficient routes to those sites. We compare explora tion strategies for a repeated task, where the agent is given some particul ar task to perform some number of times. Tasks are modeled as navigation on a partially known (deterministic) graph. This paper describes a new utilit y-based exploration algorithm for repeated tasks which interleaves explorat ion with task performance. The method takes into account both the costs and the potential benefits (for future task repetitions) of different explorat ory actions. Exploration is performed in a greedy fashion, with the locally optimal exploratory action performed during repetition of each task. We ex perimentally evaluated our utility-based interleaved exploration algorithm against a heuristic search algorithm for exploration before task performanc e (a priori exploration) as well as a randomized interleaved exploration al gorithm. We found that for a single repeated task, utility-based interleave d exploration consistently outperforms the alternatives, unless the number of task repetitions is very high. In addition, we extended the algorithms f or the case of multiple repeated tasks, where the agent has a different, ra ndomly-chosen task (from a known subset of possible tasks) to perform each time. Here too, we found that utility-based interleaved exploration is clea r in most cases.