Rationality of reward sharing in multi-agent reinforcement learning

Authors

Miyazaki, K; Kobayashi, S

Citation

K. Miyazaki and S. Kobayashi, Rationality of reward sharing in multi-agent reinforcement learning, NEW GEN COM, 19(2), 2001, pp. 157-172

Subject Categories

Computer Science & Engineering

Journal title

NEW GENERATION COMPUTING

ISSN journal

0288-3635

Volume

19

Issue

2

Year of publication

2001

Pages

157 - 172

Database

ISI

SICI code

0288-3635(2001)19:2<157:RORSIM>2.0.ZU;2-W

Abstract

In multi-agent reinforcement learning systems, it is important to share a reward among all agents. We focus on the Rationality Theorem of Profit Sharing 5) and analyze how to share a reward among all profit sharing agents. When an agent gets a direct reward R (R > 0), an indirect reward $\mu R$ ($\mu \ge 0$) is given to the other agents. We have derived the necessary and sufficient condition to preserve the rationality as follows: $\mu < \frac{M - 1}{M^{W} \left(1 - (1/M)^{W_0}\right) (n - 1) L}$, where M and L are the maximum numbers of all conflicting rules and of rational rules, respectively, for the same sensory input, W and $W_0$ are the maximum episode lengths of a direct-reward agent and an indirect-reward agent, and n is the number of agents. This theorem is derived by avoiding the least desirable situation, in which the expected reward per action is zero. Therefore, if we use this theorem, we can experience several efficient aspects of reward sharing. Through numerical examples, we confirm the effectiveness of this theorem.
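
As an illustration of the bound stated in the abstract, the following is a minimal sketch that evaluates the right-hand side of the inequality and checks whether a candidate indirect-reward ratio mu falls below it. The function names `mu_upper_bound` and `preserves_rationality` are hypothetical and do not come from the paper, and the input restrictions (M >= 2, n >= 2, positive L, W, W0) are assumptions made here so that the denominator is nonzero.

```python
from fractions import Fraction


def mu_upper_bound(M, L, W, W0, n):
    """Right-hand side of the abstract's condition:
    (M - 1) / (M**W * (1 - (1/M)**W0) * (n - 1) * L).

    Assumption of this sketch: M >= 2 and n >= 2, so the denominator
    is strictly positive; L, W and W0 are positive integers.
    """
    if M < 2 or n < 2 or L < 1 or W < 1 or W0 < 1:
        raise ValueError("expected M >= 2, n >= 2 and L, W, W0 >= 1")
    # Exact rational arithmetic avoids rounding when M**W is large.
    denominator = Fraction(M) ** W * (1 - Fraction(1, M) ** W0) * (n - 1) * L
    return Fraction(M - 1) / denominator


def preserves_rationality(mu, M, L, W, W0, n):
    """True when 0 <= mu < bound, i.e. the stated condition holds."""
    return 0 <= mu < mu_upper_bound(M, L, W, W0, n)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper's numerical examples.
    bound = mu_upper_bound(M=2, L=1, W=3, W0=3, n=2)
    print(bound, preserves_rationality(0.1, M=2, L=1, W=3, W0=3, n=2))
```

Because the bound shrinks like 1/M^W, longer direct-reward episodes or more conflicting rules leave very little room for the indirect reward under this condition.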