Reinforcement Learning and Artificial Intelligence (RLAI)
The Gridworld Demo
This page is intended as a forum for discussing current development of
the Gridworld demo, and for setting out open issues and questions
clearly so that others in the group can answer them. We hope that
changes to the code (even very buggy pre-beta versions) will be made
available here, or at least announced here, and that people who are
interested in the gridworld, or are working on their own versions of
it, will contribute to the overall design and implementation of the
demo.
Current code can be obtained from the RLtoolkit
CVS repository.
Note that this code may (and probably will) contain bugs. Please report
any true bugs (exceptions, obviously wrong behavior), as well as
anything that doesn't seem intuitive or that could be improved in some
way. We want this demo to be robust and easy for others to adapt.
Issues with gridworld right now:
- Speed - of both the GUI and the underlying gridworld
code. Any help improving the speed here would be appreciated. No one
will want to use a slow demo, no matter how many features it has.
- Objects - we are adding objects to the gridworld, which may
provide positive or negative rewards when the agent lands on them.
Positive objects lead to a feeding frenzy. We need to be able to have
removable or consumable objects and, ideally, a function associated
with each object that indicates how the reward there changes over time
and what happens when the agent lands on it (object consumed? agent
teleported? etc.). How should the functions be added to the objects, how
should they be used, and what kinds of functions are we looking at?
- Handling agents with different learning methods. We have a menu
where different agents can be selected. Are there others that should be
there? Are the ones there working as expected?
- Handling different types of gridworlds easily, and having easy
access to the display for new types of gridworlds.
- Should we allow multiple gridworlds to run at once, or restrict
the user to exactly one?
- and more ... If you have an issue you want addressed, you can add
it here.
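As a starting point for discussion of the Objects issue above, here is one
possible way to attach functions to objects, sketched in Python. It is not
taken from the existing gridworld code: every name in it (GridObject,
LandingResult, consume, teleporter, step_onto) is hypothetical, and the
time step t is assumed to be a simple integer counter. Each object carries
one function giving its reward at time t and one function describing what
happens when the agent lands on it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Position = Tuple[int, int]


@dataclass
class LandingResult:
    """Outcome of the agent landing on an object (hypothetical type)."""
    reward: float
    consumed: bool = False                   # remove the object afterwards?
    teleport_to: Optional[Position] = None   # move the agent elsewhere?


@dataclass
class GridObject:
    """Hypothetical gridworld object whose behavior is given by functions."""
    name: str
    reward_fn: Callable[[int], float]        # reward available at time step t
    on_land: Callable[["GridObject", int], LandingResult]


def consume(obj: GridObject, t: int) -> LandingResult:
    """Landing behavior: pay out the current reward, then disappear."""
    return LandingResult(reward=obj.reward_fn(t), consumed=True)


def teleporter(target: Position) -> Callable[[GridObject, int], LandingResult]:
    """Build a landing behavior that pays the reward and moves the agent."""
    def on_land(obj: GridObject, t: int) -> LandingResult:
        return LandingResult(reward=obj.reward_fn(t), teleport_to=target)
    return on_land


def step_onto(objects: Dict[Position, GridObject], pos: Position, t: int):
    """Apply the object at pos, if any. Returns (reward, new_position)."""
    obj = objects.get(pos)
    if obj is None:
        return 0.0, pos
    result = obj.on_land(obj, t)
    if result.consumed:
        del objects[pos]                     # consumable: gone after one visit
    new_pos = result.teleport_to if result.teleport_to is not None else pos
    return result.reward, new_pos


# Example layout: food whose value decays over time, and a pit that
# sends the agent back to the start square.
objects = {
    (1, 1): GridObject("food", lambda t: max(0.0, 10.0 - t), consume),
    (2, 2): GridObject("pit", lambda t: -5.0, teleporter((0, 0))),
}
```

With this layout, the food pays out once and is then removed, which is one
way to avoid the feeding frenzy described above. Whether behaviors should
be plain functions like these, or methods on object subclasses, is still an
open design question.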