
Standard Mountain Car
The Best Sellers recently underwent some revisions to ensure
compatibility with the release of RL-Glue 2.0. Check out RL-Glue 2.0's
new website
(same as old link) for more details.
Download
Project Details
Release Date: April 1, 2007
Version: 2.0
RL-Glue Compatibility: version 2.0
Language: C++
Contents: Mountain Car environment program, Sarsa(lambda) tile coding agent program, experiment program
Instructions: unzip into the Examples directory of the latest RL-Glue distribution, then make and run:
>>make
>>./RL_glue
Mountain Car Benchmarks
Standard Mountain Car with random starts:
| | | Online performance: Average reward per episode | Asymptotic performance: Average reward per episode |
| 1 | SarsaLambda [White, 2007] | -122.28 (standard error = 0.67) | -53.92 (standard error = 0.37) |
| 2 | ... | | |
Standard Mountain Car with bottom starts:
| | | Online performance: Average reward per episode | Asymptotic performance: Average reward per episode |
| 1 | SarsaLambda [White, 2007] | -212.26 (standard error = 0.74) | -106.59 (standard error = 0.17) |
| 2 | ... | | |
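The page does not say how the standard errors in the tables above were computed. A minimal sketch, assuming each entry is the mean over the 100 independent runs described in the Summary and that the standard error is the sample standard deviation across runs divided by the square root of the number of runs. The names RunSummary and summarize are hypothetical and are not part of the distributed project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and standard error of one performance measure across runs.
struct RunSummary {
    double mean;
    double standardError;
};

// perRunAverage[i] is the average reward per episode measured in
// independent run i (assumption: one number per run).
RunSummary summarize(const std::vector<double>& perRunAverage) {
    assert(perRunAverage.size() >= 2);  // sample variance needs two runs
    const double n = static_cast<double>(perRunAverage.size());

    double sum = 0.0;
    for (double r : perRunAverage) sum += r;
    const double mean = sum / n;

    double squaredDeviations = 0.0;
    for (double r : perRunAverage) squaredDeviations += (r - mean) * (r - mean);

    // Unbiased sample variance (divide by n - 1), then the standard
    // error of the mean: sqrt(variance / n).
    const double sampleVariance = squaredDeviations / (n - 1.0);
    return {mean, std::sqrt(sampleVariance / n)};
}
```

Under this assumption, a new benchmark entry would call summarize once on the 100 per-run online averages and once on the 100 per-run asymptotic averages.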
History
The Mountain Car task was originally proposed by Andrew Moore in his
PhD dissertation (1990). Singh and Sutton later used Mountain Car in
their work on eligibility traces (1996). Singh and Sutton formalized
the state update equations for the position and velocity of the car
based on Moore's original problem specification. They also made their
Mountain Car implementation available online. Mountain Car has become
a popular test bed for reinforcement learning algorithms, especially
for work on function approximation. Over the years there have been
several variations on Singh and Sutton's version of the problem:
different reward functions, starting states and termination
conditions. The following list highlights the variety of Mountain Car
problems studied in the literature: Smart and Kaelbling (2000), Boyan
and Moore (1995), Wiewiora et al. (2003), Riedmiller (2005), Bagnell
(2004), and Sutton (1996). The version of Mountain Car used in
"Reinforcement Learning: An Introduction" (1998) is identical to the
one used by Singh and Sutton.
Summary
The Mountain Car task is one of the most widely used reinforcement
learning test beds in machine learning research and classes. We
provide a Standard version of Mountain Car, based on the Singh and
Sutton description (1996). We have chosen this Mountain Car
specification because it is the most widely used variant of the
problem and is also based on the first publicly available problem
description that fully specified the state transition dynamics. We set
the first benchmark for this domain with a simple Sarsa(lambda)
control agent with tile coding, described in Figure 8.8 of Sutton and
Barto.

The Standard Mountain Car problem is fully specified in Example 8.2 of
the book "Reinforcement Learning: An Introduction", by Sutton and
Barto. The environment program allows episodes to begin with the car
at the bottom of the hill with zero velocity or with the car at a
random position and velocity. The latter is done to make it impossible
for deterministic strategies to solve the task. Switching between
bottom and random starts is controlled by a flag in MountainCar.h.

The Standard Mountain Car Project reports online and asymptotic
performance measures based on 100 independent runs. For the online
performance, the agent is trained for 100 episodes and its average
reward per episode is recorded. For the asymptotic performance, the
agent is trained for 10000 episodes, then its policy is frozen
(learning turned off) and its average reward per episode over 100
episodes is recorded. The Standard Mountain Car Project includes a
Sarsa(lambda) control agent.
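
The state transition dynamics referred to above can be sketched as follows. This is an illustrative sketch of the update equations from Example 8.2 of Sutton and Barto, not the code shipped in the project's MountainCar.h. The names CarState, startState, step and atGoal are hypothetical, the bottom-start position of -0.5 is an assumption, and so is drawing random starts uniformly over the full position and velocity ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Position and velocity of the car.
struct CarState {
    double position;
    double velocity;
};

// Bounds from Example 8.2 of Sutton and Barto (1998).
constexpr double kMinPosition = -1.2;
constexpr double kMaxPosition = 0.5;  // reaching this position ends the episode
constexpr double kMaxSpeed = 0.07;
constexpr double kStepReward = -1.0;  // reward on every step until the goal

// Assumed bottom-of-hill start; the packaged environment may differ.
constexpr double kBottomPosition = -0.5;

// Hypothetical stand-in for the start-state flag in MountainCar.h.
CarState startState(bool randomStart, std::mt19937& rng) {
    if (!randomStart) return {kBottomPosition, 0.0};
    // Half-open ranges, so a random start is never already at the goal.
    std::uniform_real_distribution<double> pos(kMinPosition, kMaxPosition);
    std::uniform_real_distribution<double> vel(-kMaxSpeed, kMaxSpeed);
    return {pos(rng), vel(rng)};
}

// Apply one action (-1 = reverse, 0 = coast, +1 = forward).
CarState step(const CarState& s, int action) {
    assert(action >= -1 && action <= 1);
    // Velocity update: throttle term minus the gravity term, then bound.
    double v = s.velocity + 0.001 * action - 0.0025 * std::cos(3.0 * s.position);
    v = std::max(-kMaxSpeed, std::min(kMaxSpeed, v));
    // Position update uses the new velocity, then bound.
    double x = std::max(kMinPosition, std::min(kMaxPosition, s.position + v));
    // Hitting the left bound resets the velocity to zero.
    if (x <= kMinPosition) v = 0.0;
    return {x, v};
}

bool atGoal(const CarState& s) { return s.position >= kMaxPosition; }
```

An episode loop would call startState once, then repeatedly call step with the agent's action and hand back kStepReward until atGoal returns true.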