Reinforcement Learning and Artificial Intelligence (RLAI)
RL-Glue 2.0 Utilities
Edited by Leah Hackman
The ambition of this web
page is to discuss the current state, uses and future of RL-Glue 2.0
Utilities.
Writing in C/C++ and using the direct-connect version of RL-Glue
brings a number of advantages. The direct connection gives a definite
speed-up in experiments; another advantage is the RL-Glue Utilities.
To help save you time, we have included some bits of code that we have
found to be frequently useful. They are in the Utilities folder of the
RL-Glue download. As these utilities are not a formal part of RL-Glue,
we will not be converting them into other languages. If you write
similar utilities for your own code, in any language, feel free to pass
them our way so that they may be shared with the entire community.
Included in the Utilities Folder:
All the code resides in the Glue_utilities.c(pp) and
Glue_utilities.h files. Currently these files include a parser for the
task_specification. Glue_utilities.h defines a struct that holds all
the information extracted from the task_spec. To gain access to the
parser, all you need to do is #include "Glue_utilities.h" and call the
function parse_task_spec. The signature of the function is:
    void parse_task_spec( char *, task_spec_struct * )

The first argument is the task_specification string to be parsed and
the second is a pointer to the task_spec_struct that will hold all the
details in the task_specification.
A call to parse_task_spec fills in all the internal details of the
task_spec_struct according to the provided task_specification. These
"mysterious" internal details are defined in Glue_utilities.h as
follows:
    typedef struct
    {
        float version;                  // The version number
        char episodic;                  // 'e' if the task is episodic, 'c' if it is continuous

        int obs_dim;                    // The number of observation variables
        int num_discrete_obs_dims;      // The number of observation variables declared as ints
        int num_continuous_obs_dims;    // The number of observation variables declared as floats
        // A character array ordering the ints and doubles: e.g. ifi would mean
        // the observations contain one int, one float, and one int.
        char *obs_types;
        // Contains the minimum values for each observation variable,
        // contains DBL_MIN for an unspecified min
        double *obs_mins;
        // Contains the maximum values for each observation variable,
        // contains DBL_MAX for an unspecified max
        double *obs_maxs;

        int action_dim;                 // The number of action variables
        int num_discrete_action_dims;   // The number of action variables declared as ints
        int num_continuous_action_dims; // The number of action variables declared as doubles
        // A character array ordering the ints and doubles: e.g. ifi would mean
        // the actions contain one int, one float, and one int.
        char *action_types;
        // Contains the min values for each action variable,
        // contains DBL_MIN for an unspecified min
        double *action_mins;
        // Contains the max values for each action variable,
        // contains DBL_MAX for an unspecified max
        double *action_maxs;
    } task_spec_struct;
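To make the relationship between the count fields and the type arrays concrete, here is a minimal sketch of a sanity check on a filled-in struct. The helpers count_type and spec_is_consistent are hypothetical names and are not part of Glue_utilities. The struct is repeated only so that the sketch compiles on its own; real code would #include "Glue_utilities.h" instead. The sketch assumes that every variable is either an 'i' or an 'f', so that each dimension count is the sum of its discrete and continuous counts.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in copy of the struct listed above, so the sketch compiles alone.
// Real code would #include "Glue_utilities.h" instead of repeating it.
typedef struct
{
    float version;
    char episodic;
    int obs_dim;
    int num_discrete_obs_dims;
    int num_continuous_obs_dims;
    char *obs_types;
    double *obs_mins;
    double *obs_maxs;
    int action_dim;
    int num_discrete_action_dims;
    int num_continuous_action_dims;
    char *action_types;
    double *action_mins;
    double *action_maxs;
} task_spec_struct;

// Hypothetical helper: count how often `code` appears in the first
// `dim` entries of a type array such as obs_types or action_types.
int count_type(const char *types, int dim, char code)
{
    assert(types != NULL);
    int n = 0;
    for (int i = 0; i < dim; ++i)
        if (types[i] == code)
            ++n;
    return n;
}

// Hypothetical helper: true if the count fields agree with the type arrays.
// Assumes every variable is tagged either 'i' (int) or 'f' (float).
bool spec_is_consistent(const task_spec_struct *s)
{
    assert(s != NULL);
    if (s->episodic != 'e' && s->episodic != 'c')
        return false;
    if (s->obs_dim != s->num_discrete_obs_dims + s->num_continuous_obs_dims)
        return false;
    if (s->action_dim != s->num_discrete_action_dims + s->num_continuous_action_dims)
        return false;
    return count_type(s->obs_types, s->obs_dim, 'i') == s->num_discrete_obs_dims
        && count_type(s->obs_types, s->obs_dim, 'f') == s->num_continuous_obs_dims
        && count_type(s->action_types, s->action_dim, 'i') == s->num_discrete_action_dims
        && count_type(s->action_types, s->action_dim, 'f') == s->num_continuous_action_dims;
}
```

A check like this could be run once after parsing, before any arrays are sized from the counts.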
It is perhaps best to go through a couple of examples. To write an
agent we need to create an Action struct, but we may not know at
compile time how many variables an Action contains. The following
code parses the task specification and then reads the number of
discrete and continuous action variables to create an Action struct.
    #include "Glue_utilities.h"

    Action oldAction;
    task_spec_struct ps;

    void agent_init(Task_specification ts)
    {
        // Fill ps from the task specification string
        parse_task_spec(ts, &ps);

        // Size the Action from the parsed counts
        oldAction.numInts     = ps.num_discrete_action_dims;
        oldAction.numDoubles  = ps.num_continuous_action_dims;
        oldAction.intArray    = new int[oldAction.numInts];
        oldAction.doubleArray = new double[oldAction.numDoubles];
    }
After parse_task_spec fills in all the necessary details, the desired
information is extracted from the struct using the dot (.) operator.
This is first seen in oldAction.numInts = ps.num_discrete_action_dims,
where we read the number of discrete action variables by using the dot
operator on the task_spec_struct named ps.
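The arrays created with new[] in agent_init are never released in the example, and calling agent_init a second time would leak them. One way to pair the allocation with a matching release is sketched below. The Action struct here is only a stand-in limited to the four fields used above (the real type comes from the RL-Glue headers), and allocate_action and release_action are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the RL-Glue Action type, limited to the four fields used
// in the example above. Real code would use the type from RL-Glue.
struct Action
{
    int numInts;
    int numDoubles;
    int *intArray;
    double *doubleArray;
};

// Hypothetical helper: size an Action from the discrete and continuous
// counts read out of the parsed task specification.
void allocate_action(Action *a, int num_discrete, int num_continuous)
{
    assert(a != NULL && num_discrete >= 0 && num_continuous >= 0);
    a->numInts = num_discrete;
    a->numDoubles = num_continuous;
    a->intArray = new int[num_discrete]();          // () zero-initialises the elements
    a->doubleArray = new double[num_continuous]();
}

// Hypothetical helper: free the arrays and leave the Action empty, so it
// can safely be allocated again on the next initialisation.
void release_action(Action *a)
{
    assert(a != NULL);
    delete[] a->intArray;
    delete[] a->doubleArray;
    a->intArray = NULL;
    a->doubleArray = NULL;
    a->numInts = 0;
    a->numDoubles = 0;
}
```

With these helpers, agent_init would call allocate_action(&oldAction, ps.num_discrete_action_dims, ps.num_continuous_action_dims), and the agent's cleanup code would call release_action(&oldAction).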
Perhaps later in the code, we want to ensure the first Action variable
is within its range. We could do this like so:
    // action_types holds one character per action variable: 'i' or 'f'
    if (ps.action_types[0] == 'i') {
        if (oldAction.intArray[0] < ps.action_mins[0])
            printf("Your first action variable is invalid: Integer representation too low\n");
        if (oldAction.intArray[0] > ps.action_maxs[0])
            printf("Your first action variable is invalid: Integer representation too high\n");
    }
    else if (ps.action_types[0] == 'f') {
        if (oldAction.doubleArray[0] < ps.action_mins[0])
            printf("Your first action variable is invalid: Double representation too low\n");
        if (oldAction.doubleArray[0] > ps.action_maxs[0])
            printf("Your first action variable is invalid: Double representation too high\n");
    }

First we check whether the first element of our action is an integer
or a double by looking at the action_types character array. Because
action_types[0] is a single character, we compare it directly with 'i'
or 'f' rather than calling strncmp. Then we access action_mins and
action_maxs using the . operator again, and index into each with the
array [] brackets to get the first element.
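The same check can be extended to every action variable. The sketch below is one possible way to do it, not code from Glue_utilities. Action and action_spec are trimmed stand-ins carrying only the fields needed here, and first_out_of_range is a hypothetical helper name. It assumes that action_mins and action_maxs are indexed by the overall position of the variable, while intArray and doubleArray are each indexed by position within their own type. Following the struct comments, DBL_MIN and DBL_MAX are treated as markers for an unspecified bound. Note that DBL_MIN from <cfloat> is the smallest positive normalized double rather than the most negative value, so comparing against it directly would reject zero and every negative action.

```cpp
#include <cassert>
#include <cfloat>
#include <cstddef>

// Trimmed stand-ins with only the fields this sketch needs. Real code
// would use Action from RL-Glue and task_spec_struct from Glue_utilities.h.
struct Action
{
    int numInts;
    int numDoubles;
    int *intArray;
    double *doubleArray;
};

struct action_spec
{
    int action_dim;
    const char *action_types;
    const double *action_mins;
    const double *action_maxs;
};

// Hypothetical helper: returns the index of the first action variable
// that is out of range, or -1 if every variable is within its bounds.
int first_out_of_range(const Action *a, const action_spec *s)
{
    assert(a != NULL && s != NULL);
    int next_int = 0;     // next unread slot in intArray
    int next_double = 0;  // next unread slot in doubleArray
    for (int i = 0; i < s->action_dim; ++i) {
        double value;
        if (s->action_types[i] == 'i') {
            if (next_int >= a->numInts)
                return i;  // the Action has fewer ints than the spec expects
            value = a->intArray[next_int++];
        } else {
            if (next_double >= a->numDoubles)
                return i;  // the Action has fewer doubles than the spec expects
            value = a->doubleArray[next_double++];
        }
        // DBL_MIN / DBL_MAX mark an unspecified bound, so skip that side.
        bool has_min = s->action_mins[i] != DBL_MIN;
        bool has_max = s->action_maxs[i] != DBL_MAX;
        if ((has_min && value < s->action_mins[i]) ||
            (has_max && value > s->action_maxs[i]))
            return i;
    }
    return -1;
}
```

An agent could call this before returning an action and report the offending index when the result is not -1.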
The last thing you need to know about using the parser is how to add
it to the compilation process. This depends on whether you are using
the direct or the socketed approach. Directions on how to add external
code are provided in the Writing instructions.