Abstract:
Imitation can be viewed as a
means of enhancing learning in multiagent
environments. It augments an agent's ability to learn useful
behaviors by making intelligent use of the knowledge implicit in
behaviors demonstrated by cooperative teachers or other more
experienced agents. We propose and study a formal model of implicit
imitation that can accelerate reinforcement learning dramatically in
certain cases. Roughly, by observing a mentor, a
reinforcement-learning agent can extract information about its own
capabilities in, and the relative value of, unvisited parts of the
state space. We study two specific instantiations of this model, one
in which the learning agent and the mentor have identical abilities,
and one designed to deal with agents and mentors with different action
sets. We illustrate the benefits of implicit imitation by integrating
it with prioritized sweeping, and demonstrating improved performance
and convergence through observation of single and multiple mentors.
Though we make some stringent assumptions regarding observability and
possible interactions, we briefly comment on extensions of the model
that relax these restrictions.
Keywords: Reinforcement Learning, Implicit Imitation, Markov
Decision Processes, Approximate Policy Iteration.
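
As a rough illustration of how observing a mentor can inform the value of unvisited states, the sketch below shows one possible form of a value backup that takes the better of the learner's own estimated action models and a transition model estimated from the mentor's observed behavior. It is a minimal sketch, not the formulation defined in the paper: the function name `augmented_backup`, the dictionary-based models, and the assumptions that the learner sees only the mentor's state transitions (not its actions) and could reproduce them with its own actions are all hypothetical choices made for this example.

```python
def augmented_backup(state, values, reward, own_model, mentor_model, gamma=0.9):
    """One value backup that also considers the mentor's observed transitions.

    All names and data layouts here are illustrative assumptions:
      own_model[state][action] -> {next_state: prob}, from the learner's own experience
      mentor_model[state]      -> {next_state: prob}, from observed mentor transitions
    """
    def expected_value(dist):
        # Expected value of the successor state under a transition distribution.
        return sum(prob * values[nxt] for nxt, prob in dist.items())

    # Candidates from actions the learner has actually tried in this state.
    candidates = [expected_value(dist) for dist in own_model.get(state, {}).values()]

    # Extra candidate from the mentor, assuming the learner has some action
    # that can reproduce the mentor's transition (identical abilities).
    if state in mentor_model:
        candidates.append(expected_value(mentor_model[state]))

    # With no information at all, fall back to a successor value of 0.
    best = max(candidates, default=0.0)
    return reward[state] + gamma * best


# Tiny chain 0 -> 1 -> 2 with reward only in state 2. The learner has only
# tried "stay" in state 0, but has watched the mentor walk down the chain.
values = {0: 0.0, 1: 0.0, 2: 1.0}
reward = {0: 0.0, 1: 0.0, 2: 1.0}
own_model = {0: {"stay": {0: 1.0}}}
mentor_model = {0: {1: 1.0}, 1: {2: 1.0}}

with_mentor = augmented_backup(1, values, reward, own_model, mentor_model)
without_mentor = augmented_backup(1, values, reward, own_model, {})
```

In this toy setting the learner has never visited state 1, so its own models say nothing about it; the mentor's observed transition is the only source of a nonzero value estimate there. A backup of this shape could in principle be slotted into prioritized sweeping, though how that integration is done is not specified in the abstract.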