Reinforcement Learning and Artificial Intelligence (RLAI)
Chapter 5 Reading and Exercises
Reading:
Due: Monday, October 18th, 10 PM
Three thoughtful questions showing that you read the chapter.
These will be marked on the subjective probability that you have read the chapter.
E-mail these to BOTH c499@ugrad.cs.ualberta.ca AND rich@richsutton.com with the subject line "Thought Questions". Please send a plain-text e-mail message with no attachments (you know who you are)!
Due: Tuesday, October 19th
Read all of Chapter 5
Written Exercises:
Due: Thursday, October 21st
Exercises 5.1, 5.2, 5.5
Please note that these are exercise numbers from the real book. The numbers in the electronic version may differ.
Programming Exercise:
Due: Tuesday, October 26th
1) Run the Monte Carlo ES algorithm (download here) on the party problem for 10 000 episodes. Compare and discuss the final policy and value function against your value iteration results from the last programming assignment. The command to unpack the tar file is: tar -xf monteCarloProgramming.tar
2) Modify the Monte Carlo ES code to use the incremental averaging methods from section 2.5 (reprised in exercise 5.5), and turn in your modified Python code. Run your code on the party problem for 10 000 episodes and hand in the output from the program. Also show the learned action values (Q).
3) Did you notice any difference in runtime speed between the two implementations? Discuss why there might be a difference.
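As a reminder of what "incremental averaging" in item 2 refers to, here is a minimal sketch of the general update, NewEstimate = OldEstimate + (1/k) * (Return - OldEstimate). It is not taken from the provided Monte Carlo ES code: the names update_q, Q, N and the state and action labels are hypothetical and chosen only for illustration, so the distributed code will likely be organized differently.

```python
from collections import defaultdict

# Hypothetical tables, keyed by (state, action) pairs.
Q = defaultdict(float)  # running average of the returns seen so far
N = defaultdict(int)    # number of returns averaged into each entry


def update_q(state, action, episode_return):
    """Fold one new return into the running average for (state, action).

    Only the current estimate and a counter are kept; no list of past
    returns is stored.
    """
    key = (state, action)
    N[key] += 1
    Q[key] += (episode_return - Q[key]) / N[key]
    return Q[key]


# Illustrative returns for one made-up state-action pair.
sample_returns = [2.0, 4.0, 9.0, 1.0]
for g in sample_returns:
    update_q("some_state", "some_action", g)

# The incremental estimate matches the ordinary sample mean.
batch_mean = sum(sample_returns) / len(sample_returns)
print(Q[("some_state", "some_action")], batch_mean)
```

Each update costs the same amount of work no matter how many returns have already been observed, which is one thing to keep in mind when thinking about item 3.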
Submission instructions
Please hand in a hard copy of the assignment at the beginning of class on the due date.