Question: What does Observation mean? Why does RL-Glue not pass
around "states"?
Answer: Observation is a more general term that includes the concept
of state; the state of, say, mountain car is one kind of observation.
An observation can be an array of doubles, an int, or whatever else
the task calls for. This is controlled in the common types file for
each agent-environment pair.
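To make the idea concrete, here is a minimal sketch in Python. It does not use the RL-Glue types file or any RL-Glue API; the `Observation` alias and both helper functions are hypothetical names chosen for illustration only. It shows two tasks whose observations have different shapes: an array of doubles in one case, a single int in the other.

```python
from typing import List, Union

# Hypothetical alias for illustration only. In RL-Glue the real type is
# defined in the common types file, not here.
Observation = Union[int, List[float]]


def mountain_car_observation(position: float, velocity: float) -> Observation:
    """The observation happens to be the full state: an array of doubles."""
    return [position, velocity]


def grid_world_observation(row: int, col: int, width: int) -> Observation:
    """The observation is a single int: the index of the current cell."""
    return row * width + col
```

Both functions return an "observation", even though only the first one looks like a conventional state vector.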
Question: Where is the "environmental state" stored in RL-Glue? In
other systems, such as CLSquare, the old state is passed to the
environment step function.
Answer: The environment in RL-Glue is responsible for keeping track of
the current "state" and computing the next "state" given an action.
The old state need not be passed in; the state stays within the
environment. The next_state method in CLSquare is basically the same
as env_step in RL-Glue.
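The difference between the two styles can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the actual RL-Glue or CLSquare API: the signatures are reduced to the bare minimum, the sketch returns only an observation from each call, and the toy task (a walk on the integers 0 to 10) and the class name `WalkEnvironment` are hypothetical.

```python
def next_state(state: int, action: int) -> int:
    """Stateless style: the caller passes the old state in explicitly."""
    # Move by `action` and clamp the result to the range 0..10.
    return max(0, min(10, state + action))


class WalkEnvironment:
    """Stateful style: the environment keeps the current state itself."""

    def __init__(self) -> None:
        self._state = 5

    def env_start(self) -> int:
        # Reset the internal state and return the first observation.
        self._state = 5
        return self._state

    def env_step(self, action: int) -> int:
        # Only the action is passed in; the old state is read from
        # self._state and overwritten with the new one.
        self._state = next_state(self._state, action)
        return self._state
```

The transition logic is the same in both cases; the only difference is who holds the state between calls.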