Reinforcement Learning and Artificial Intelligence (RLAI)
Earlier meetings of the U Alberta RLAI group

Maintained by Anna Koop

This page holds a record of UofA RLAI-group meetings from Dec 2003 to Sep 2005.  Later meetings are described at the current group meeting page.

This page was formerly an administrative page for the group.  That functionality has been moved to the RLAI group's home page.

Date
Time
Subject
Presenter
Location
Dec 22, 2003
Projects, GWMP introduction

Jan 05, 2004
Stumbling blocks for AI; how we can make progress

Jan 12, 2004
Discussion of administrative stuff and definitions

Jan 19, 2004
GWMP and the Corner World

Jan 26, 2004
TWIKI, GWMP model measures

Feb 02, 2004
GWMP 8x8 gridworld; complete controllers

Feb 09, 2004

Satinder visits; building a dog, POMDP's and Petri nets

Feb 23, 2004
Doina Precup to visit; Banff retreat possibility

Mar 01, 2004
Scope of the group; Projects

Mar 08, 2004
More Projects; PSUM's

Mar 15, 2004
Cap tests; PSUM's

Mar 29, 2004
Doina visits; April fools?

Apr 26, 2004
Rich back; catch up

May 03, 2004
New people; TD learning

May 10, 2004
Dale talks about MDP's

May 17, 2004
Michael talks about robots

May 25, 2004
Robot learning

May 31, 2004
Retreat; TD Networks

Jun 14, 2004
TD Networks

Jun 21, 2004
Tushie wiggling; Null hypothesis

Jun 28, 2004
Multiplicity automata; Animal psychology

Jul 13, 2004
John Langford talks about RL via reduction

Jul 20, 2004
Robots in the hallways, updates

Aug 03, 2004
History in TDnets; RLAI competition

Aug 11, 2004
Tshirts; Two phase learning in TD nets

Aug 23, 2004

TD nets vs Elman and other nets

Aug 29, 2004

TD nets and options; Mark Ring visiting

Sep 07, 2004

Knowledge Representation

Sep 16, 2004

Subjective Aibo maps

Sep 23, 2004

Segway is here!; Constructing feature vectors

Sep 30, 2004

News from Michigan; TDnets and options

Oct 07, 2004

Subjective Robotics; Semi-definite Programming

Oct 14, 2004

The Big Picture; Evolving Reward Functions

Oct 21, 2004

Brainstorming; Features; Whack-a-mole

Oct 28, 2004

TDnets are fixed!!; Go and TDnets

Nov 04, 2004

Monte Carlo Networks

Nov 18, 2004

Whole picture in a glance

Nov 25, 2004

Software; Continuous time; New space

Dec 02, 2004

Possible papers; Function approximation

Dec 09, 2004

RL and the Budgeted Learning Problem

Jan 06, 2005

Papers; Meeting schedules

Jan 17, 2005

Paper updates; Software

Jan 31, 2005

Upcoming RLAI Group Meeting schedule

Feb 14, 2005

Predictive Representations for Generalization

Feb 22, 2005

TD nets with history

Feb 28, 2005

The Predictive Model of Intelligence

Mar 14, 2005


Martin Zinkevich

Mar 21, 2005


Cosmin and Vadim

Apr 20, 2005

Fallout from Barbados; Cosmin prepares handout of equations


May 05, 2005

Introductions and presentation of robot
Adam Milstein

Jun 09, 2005

Administrative details


Jun 16, 2005

Gaussian Process Regression for Optimization & (un)supervised multiclass support vector machines
Linli, Dale, Tao, Dan

Jun 23, 2005

Controlling Octopus Arm with Gaussian Process Temporal Difference and RL in the shape of Go
Yaakov and David

Jul 07, 2005

Bayes Meets Bellman: Gaussian Process and Temporal Difference
Yaakov Engel

Jul 14, 2005

Bayes Meets Bellman: Gaussian Process and Temporal Difference (cont'd)
Yaakov Engel

Jul 21, 2005

Racing Simulation (RARS)
Tom

Aug 04, 2005
12:01pm - 1:00pm
Predictive Knowledge Representations
Rich Sutton
CSC 333
Aug 18, 2005
1:00pm - 2:00pm
Recap after conferences
Rich Sutton
CSC 249
Aug 29, 2005
11:30am - 1:00pm
Natural Gradient
Dan Lizotte
CSC 333
Sep 1, 2005
Holiday
No meeting, Labour Day





AAAI'05 had a number of presentations relevant to RL in general, abstraction in RL, and options. I have posted my impressions here:

http://www.cs.ualberta.ca/~bulitko/ircl/essays/pres/2005-07-20-AAAIi.pdf

I have the proceedings in an electronic format, if anyone is curious. Comments are welcome, Vadim Bulitko.  

Sham Kakade will be visiting us during Nov. 7-8.

Below is Kakade's talk abstract.

Title:
A Natural Policy Gradient

Abstract:

I'll discuss a natural gradient method that represents the steepest
descent direction based on the underlying statistical structure of the
parameter space. This method has some natural connections to
approximate policy iteration with "compatible" value functions (where
compatibility is defined with respect to the policy class
parameterization, as defined by Sutton et al.).  Experimental
results, from some simple MDPs and from the more challenging MDP of
Tetris, will be discussed. I'll also try to summarize some of the
recent subsequent work (done by myself and others) on policy gradient
methods.  

Khashayar Rohanimanesh (Khash) will be visiting us on Nov. 14 and 15.
Khash has recently received his Ph.D. from University of Massachusetts Amherst under the supervision of Sridhar Mahadevan. Below is the title and abstract of his talk.

TIME: Monday Nov. 14, 12:00 to 1:00pm, CSC 333

TITLE: Concurrent Decision Making in Markov Decision Processes

ABSTRACT
--------

Concurrent decision making and coordination has been recognized as a
fundamental problem in many areas of robotics, control, and computer
science.  In the field of Artificial Intelligence in particular, this
problem is recognized as a formidable challenge. By concurrent
decision making we refer to a class of problems that require agents to
accomplish long-term goals by concurrently executing multiple
activities. In general, the problem is difficult to solve as it
requires learning and planning with a combinatorial set of interacting
concurrent activities with uncertain outcomes that compete for limited
resources in the system.

In this talk we present a general framework for modeling the
concurrent decision making problem based on semi-Markov decision
processes (SMDPs). Our approach is based on a centralized control
formalism, where we assume a central control mechanism initiates,
executes and monitors concurrent activities. This view also captures
the type of concurrency that exists in single agent domains, where a
single agent is capable of performing multiple activities
simultaneously by exploiting the degrees of freedom (DOF) in the
system. We present a set of coordination mechanisms employed by our
model for monitoring the execution and termination of concurrent
activities. Such coordination mechanisms incorporate various natural
activity completion mechanisms based on the individual termination of
each activity. We provide theoretical results that assert the
correctness of the model semantics which allows us to apply standard
SMDP learning and planning techniques for solving the concurrent
decision making problem.  

Theoretically, standard SMDP solution methods do not scale to
concurrent decision making systems with large degrees of freedom. This
problem is a classic example of the curse of dimensionality in the
action space, where the size of the set of concurrent activities
exponentially grows as the system admits more degrees of freedom. To
alleviate this problem, we develop a novel decision theoretic
framework motivated in spirit by the coarticulation phenomenon
investigated in speech and motor control research. We show that by
applying coarticulation to systems with excess degrees of freedom,
concurrency is naturally generated. We present a set of theoretical
results that characterizes the efficiency of the concurrent decision
making based on the coarticulation framework when compared to the case
in which the agent is allowed to only execute activities sequentially
(i.e., no coarticulation). We also present a set of techniques for
scaling the coarticulation framework to large domains. We empirically
evaluate our algorithms in a set of simulated domains ranging from an
agent navigating in a grid world performing concurrent activities, to
a simulated domain with multiple degrees of freedom that is capable of
performing tasks concurrently.  

This is the page that USED TO hold the mailing list for the RLAI group.  There is no need to unsubscribe from this page, but its function has been taken over by a new page:

    http://rlai.cs.ualberta.ca/RLAI/RL_PP.html

You need to subscribe to the new page to stay informed of RLAI events.