
Standard Mountain Park

The Best Sellers recently underwent some revisions to ensure compatibility with the release of RL-Glue 2.0. Check out RL-Glue 2.0's new website (same as old link) for more details.


Download 

Standard Mountain Park Project
(Old Version: pre RL-Glue 2.0)

Project Details

Author: Adam White
Release Date: April 1, 2007
Version: 1.0
RL-Glue Compatibility: version 2.0
Language: C++

Contents: Mountain Park environment program, Sarsa lambda tile coding agent program, experiment program

***** IMPORTANT NOTE *****
The Standard Mountain Park takes much longer to learn and run than the Standard Mountain Car. If you run the environment and it does not show results immediately, that does not necessarily mean there is a problem with your code. Give this environment a few minutes longer than you would give the Standard Mountain Car before assuming there is an error.

Instructions: unzip into the Examples directory of the latest RL-Glue distribution, then make and run:
    >>make
    >>./RL_glue 

Mountain Car Benchmarks


Standard Mountain Park with random starts:

Rank  Agent                      Online performance:               Asymptotic performance:
                                 Average reward per episode        Average reward per episode

1     SarsaLambda [White, 2007]  -176.82 (standard error = 1.84)   -156.10 (standard error = 11.90)
2     ...





Standard Mountain Park with bottom starts:

Rank  Agent                      Online performance:               Asymptotic performance:
                                 Average reward per episode        Average reward per episode

1     SarsaLambda [White, 2007]  -225.16 (standard error = 2.39)   -230.81 (standard error = 13.70)
2     ...
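
The benchmark entries above report a mean with a standard error. As a minimal sketch of how such a pair could be computed, assuming each independent run contributes one "average reward per episode" value and that the standard error is the sample standard deviation divided by the square root of the number of runs (the project's experiment program may compute it differently; `RunSummary` and `summarizeRuns` are hypothetical names):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean and standard error of the mean over independent runs.
struct RunSummary {
    double mean;
    double standardError;
};

// runAverages holds one "average reward per episode" value per run.
// Assumption: standard error = sample standard deviation / sqrt(number of runs).
RunSummary summarizeRuns(const std::vector<double>& runAverages) {
    assert(runAverages.size() >= 2);  // sample variance needs at least two runs
    const double n = static_cast<double>(runAverages.size());

    double sum = 0.0;
    for (double v : runAverages) sum += v;
    const double mean = sum / n;

    double squaredDeviations = 0.0;
    for (double v : runAverages) squaredDeviations += (v - mean) * (v - mean);
    const double sampleVariance = squaredDeviations / (n - 1.0);

    return {mean, std::sqrt(sampleVariance / n)};
}
```

A submission to the benchmark would call `summarizeRuns` once with the per-run averages from all of its runs and report the two fields side by side.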



Author Review

The Mountain Car problem is one of the classic reinforcement learning test problems. It has been well studied in publications and classes. Mountain Car is an interesting problem for a variety of reasons: the agent must deal with the delayed effects of its actions, the agent must learn to move away from the goal (incurring more negative reward) to eventually reach the goal, and it provides an excellent benchmark for function approximation techniques. The task itself, however, is relatively easy to solve. An agent that simply selects its actions to be the same as the direction of its velocity can reach the top of the hill in a fixed number of steps. In an attempt to make Mountain Car more difficult to solve while preserving the interesting properties mentioned above, we have developed a new problem: Mountain Park.
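
The velocity-following agent mentioned above can be sketched in a few lines. This is only an illustration: the constants and dynamics below are the textbook Mountain Car formulation (Sutton and Barto), assumed here rather than taken from the project's source, and all names (`CarState`, `step`, `velocityFollowingAction`, `stepsToGoal`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Textbook Mountain Car constants; assumed, not read from the project's code.
constexpr double kMinPosition = -1.2;
constexpr double kGoalPosition = 0.5;
constexpr double kMaxSpeed = 0.07;

struct CarState {
    double position;
    double velocity;
};

// Actions: 0 = full reverse, 1 = coast, 2 = full forward.
// Push in the direction the car is already moving; push forward when at rest.
int velocityFollowingAction(double velocity) {
    return velocity < 0.0 ? 0 : 2;
}

// One step of the textbook dynamics (an assumption about the environment).
CarState step(CarState s, int action) {
    assert(action >= 0 && action <= 2);
    s.velocity += 0.001 * (action - 1) - 0.0025 * std::cos(3.0 * s.position);
    s.velocity = std::max(-kMaxSpeed, std::min(kMaxSpeed, s.velocity));
    s.position += s.velocity;
    if (s.position < kMinPosition) {  // inelastic left wall
        s.position = kMinPosition;
        s.velocity = 0.0;
    }
    return s;
}

// Number of steps the heuristic needs to reach the goal, or -1 if it does
// not get there within maxSteps.
int stepsToGoal(CarState s, int maxSteps) {
    for (int t = 1; t <= maxSteps; ++t) {
        s = step(s, velocityFollowingAction(s.velocity));
        if (s.position >= kGoalPosition) return t;
    }
    return -1;
}
```

Because the push always acts along the direction of motion, the car gains energy on every swing, which is why no learning is needed to solve the original task.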

The Mountain Park problem is exactly the same as the Standard Mountain Car, except that the goal can only be reached if the car's velocity is less than a tolerance (0.001). If the car reaches the top of the hill with a velocity less than the tolerance, the episode ends; otherwise the car is reflected back down the hill. The reflection is completely elastic. The environment program also allows episodes to begin either with the car at the bottom of the hill with zero velocity or with the car at a random position and velocity. This is controlled by a flag in MPcar_common.h.

The Standard Mountain Park Project reports online and asymptotic performance measures based on 100 independent runs. For the online performance, the agent is trained for 1000 episodes and its average reward per episode is recorded. For the asymptotic performance, the agent is trained for 10000 episodes, then its policy is frozen (learning turned off) and its average reward per episode over 100 episodes is recorded. The Standard Mountain Park Project includes a Sarsa TD-Lambda control agent.
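
The Mountain Park goal rule (park below the tolerance, otherwise bounce back elastically) might look roughly like the sketch below. The tolerance of 0.001 comes from the description above; the goal position of 0.5, the use of the absolute value of the velocity, and the choice to mirror the position about the goal are assumptions made for illustration, and the names (`ParkState`, `ParkOutcome`, `applyGoalRule`) are hypothetical rather than taken from the project's code.

```cpp
#include <cassert>
#include <cmath>

constexpr double kGoalPosition = 0.5;         // assumed textbook goal position
constexpr double kVelocityTolerance = 0.001;  // tolerance stated in the review

struct ParkState {
    double position;
    double velocity;
};

struct ParkOutcome {
    ParkState state;
    bool terminal;
};

// Applies the goal rule to a state already advanced by the ordinary
// Mountain Car dynamics.
ParkOutcome applyGoalRule(ParkState s) {
    if (s.position < kGoalPosition) {
        return {s, false};  // not at the top of the hill yet
    }
    if (std::fabs(s.velocity) < kVelocityTolerance) {
        return {s, true};   // slow enough to park: the episode ends
    }
    // Completely elastic reflection (assumed form): mirror the overshoot
    // about the goal and reverse the velocity without changing the speed.
    s.position = 2.0 * kGoalPosition - s.position;
    s.velocity = -s.velocity;
    return {s, false};
}
```

Under this rule the velocity-following heuristic no longer suffices, since arriving at the top with high speed only sends the car back down the hill.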

