Reinforcement Learning and Artificial Intelligence (RLAI)
Chapter 5 Reading and Exercises

Reading:

    Due: Monday, October 18th - 10 PM

    Due: Tuesday, October 19th

Written Exercises:

    Due: Thursday, October 21st


Programming Exercise:

    Due: Tuesday, October 26th

1) Run the Monte Carlo ES algorithm (download here) on the party problem for 10 000 episodes.  Compare the final policy and value function with the results you obtained using value iteration in the last programming assignment, and discuss any differences.  The command to unpack the tar file is: tar -xf monteCarloProgramming.tar

2) Modify the Monte Carlo ES code to use the incremental averaging method from Section 2.5 (reprised in Exercise 5.5), and turn in your modified Python code.  Run your code on the party problem for 10 000 episodes and hand in the output from the program.  Also show the learned action values (Q).

3) Did you notice any difference in runtime between the two implementations?  Discuss why there might be a difference.
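
As a reminder of what exercise 2 asks for, the incremental averaging rule can be sketched as below. This is only a minimal illustration and is not taken from the distributed Monte Carlo ES code: the class name IncrementalAverage, the table Q keyed by (state, action) pairs, and the sample returns are all hypothetical. The rule itself is the standard one, NewEstimate = OldEstimate + (1/n) * (Target - OldEstimate).

```python
from collections import defaultdict


class IncrementalAverage:
    """Running mean that keeps only a sample count and the current estimate."""

    def __init__(self):
        self.count = 0
        self.value = 0.0

    def update(self, sample):
        # NewEstimate = OldEstimate + (1/n) * (Target - OldEstimate)
        self.count += 1
        self.value += (sample - self.value) / self.count
        return self.value


# Hypothetical action-value table keyed by (state, action) pairs.
Q = defaultdict(IncrementalAverage)

# Made-up returns, for illustration only (not output from the party problem).
observed = [
    (("s0", "a0"), 1.0),
    (("s0", "a0"), 3.0),
    (("s0", "a1"), -2.0),
    (("s0", "a0"), 5.0),
]
for key, G in observed:
    Q[key].update(G)

for key in sorted(Q):
    print(key, Q[key].count, Q[key].value)
```

Each update touches only the stored count and estimate, so no list of past returns has to be kept. How this maps onto the data structures in the provided code is left to you.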

   Submission instructions

Please hand in a hard copy of the assignment at the beginning of class on the due date.