Reinforcement Learning and Artificial Intelligence (RLAI)
RL Interface Development Page
The ambition of this page is to be a forum where current development on
the Reinforcement Learning Interface can be discussed, and where open
issues and questions are stated clearly so that others in the group can
answer them. It is hoped that changes to the code (even very buggy
pre-beta versions) can be made available here, or at least announced
here, and that people who have an interest in the RL interface or are
working on their own versions of it can contribute to its overall
design and implementation.
Issues with the RLinterface right now:
Speed -
Agent and Environment Functions: having just one function for
each of the agent and environment seems to be a bit restrictive,
because it forces people to check for the start of an episode inside
the function. We are going to try a version with 2 functions for each:
one called at the start of an episode, and one called on each step.
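The difference between the two styles can be sketched as follows. This is only an illustration, not the RLinterface code: the function names, the argument lists, and the convention that a reward of None marks the first call of an episode are all assumptions made for the example.

```python
def make_single_function_agent(log):
    """Single-function style: the agent must detect episode starts itself."""
    def agent_fn(s, r=None):
        # Assumed convention: r is None on the first call of an episode.
        if r is None:
            log.append(("start", s))
        else:
            log.append(("step", s, r))
        return 0  # placeholder action
    return agent_fn


def make_two_function_agent(log):
    """Two-function style: the caller decides which function applies."""
    def agent_start_fn(s):
        log.append(("start", s))
        return 0  # placeholder action

    def agent_step_fn(s, r):
        log.append(("step", s, r))
        return 0  # placeholder action

    return agent_start_fn, agent_step_fn


single_log, split_log = [], []
agent_fn = make_single_function_agent(single_log)
agent_start_fn, agent_step_fn = make_two_function_agent(split_log)

# The same two events, delivered through each style.
agent_fn("s0")
agent_fn("s1", 1.0)
agent_start_fn("s0")
agent_step_fn("s1", 1.0)
```

Both logs end up identical. The only change is that the episode-start check moves out of the user's function and into whatever code calls it.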
Tasks:
  Modify to use 2 functions for each of agent and env: start and step
    done: new calling sequence RLinterface(agentStartFn, agentStepFn,
    envStartFn, envStepFn)
  Develop regression/progression tests
  Develop simple examples for users
    simple test using no objects and no learning (just shows calling
    sequences of the various functions)
    random walk agent and environment test, using agent and
    environment as objects
    others?
  Polish code
    made minor change - now checks internally stored self.s to see if
    it is terminal rather than checking to see if next action is None -
    the latter requires that the user do something or the RL interface
    won't stop
  Get timing information
  Try to increase efficiency
    changed for loops to while loops
    split step into regular start (with check for new episode) and
    stepnext
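Several of the tasks above (the four-function calling sequence, the random walk test with agent and environment as objects, the terminal check on the stored state, and timing) can be sketched together. This is a hypothetical stand-in, not the actual RLinterface code. The class name SketchInterface, the TERMINAL marker, the episode method, the max_steps cap, and the signatures assumed for the four functions (envStartFn() returning s, envStepFn(a) returning (s, r), agentStartFn(s) and agentStepFn(s, r) returning a) are all assumptions made for illustration.

```python
import random
import time

TERMINAL = "terminal"  # assumed marker for the terminal state


class RandomWalkEnv:
    """Random walk over states 0..n_states-1, starting in the middle."""
    def __init__(self, n_states=5):
        self.n_states = n_states
        self.pos = None

    def start(self):
        self.pos = self.n_states // 2
        return self.pos

    def step(self, a):
        self.pos += a
        if self.pos < 0:
            return TERMINAL, 0.0   # fell off the left end
        if self.pos >= self.n_states:
            return TERMINAL, 1.0   # fell off the right end
        return self.pos, 0.0


class RandomWalkAgent:
    """Moves left (-1) or right (+1) at random; does no learning."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def start(self, s):
        return self.rng.choice((-1, 1))

    def step(self, s, r):
        return self.rng.choice((-1, 1))


class SketchInterface:
    """Hypothetical interface using the four-function calling sequence."""
    def __init__(self, agentStartFn, agentStepFn, envStartFn, envStepFn):
        self.agentStartFn = agentStartFn
        self.agentStepFn = agentStepFn
        self.envStartFn = envStartFn
        self.envStepFn = envStepFn
        self.s = TERMINAL  # no episode in progress yet
        self.a = None

    def episode(self, max_steps=1000):
        """Run one episode; return (number of steps, total reward)."""
        self.s = self.envStartFn()
        self.a = self.agentStartFn(self.s)
        steps, total = 0, 0.0
        # Stop on the stored state, not on the agent returning None.
        while self.s != TERMINAL and steps < max_steps:
            self.s, r = self.envStepFn(self.a)
            # The agent sees the final reward; its last action is unused.
            self.a = self.agentStepFn(self.s, r)
            total += r
            steps += 1
        return steps, total


def time_episodes(interface, n_episodes):
    """Return (total steps, elapsed seconds) over n_episodes episodes."""
    t0 = time.perf_counter()
    total_steps = sum(interface.episode()[0] for _ in range(n_episodes))
    return total_steps, time.perf_counter() - t0


env = RandomWalkEnv()
agent = RandomWalkAgent(seed=0)
rli = SketchInterface(agent.start, agent.step, env.start, env.step)
steps, total_reward = rli.episode()
total_steps, elapsed = time_episodes(rli, 100)
```

Because the loop tests the stored state, the agent never has to signal the end of an episode itself. Dividing elapsed by total_steps gives a rough per-step cost that could be compared before and after an efficiency change.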