3/1 Fictitious play (FP), non-convergence
- explore how manipulable FP really is
- implement and test proposed extensions to FP to see how convergent they are (a minimal FP sketch appears after this list)
- think about mechanism design (MD) to make FP convergent

3/6 Nash Q-learning
- apply to a repeated game with the state extended to include a little history; can it converge to trigger strategies?
- run Nash Q on a VCG-style setting; does it converge?
- think about a model-based alternative to Nash Q
- is there something more like FP that can work in stochastic games (SGs)? (see the minimax-Q sketch below for the zero-sum special case)

3/8 Calibrated learning
- implement the calibrated learning algorithm (e.g. following the discussion in Greenwald, Jafari and Marks) and see how quickly it converges
- read the Hart & Mas-Colell paper, implement it, and test it (a regret-matching sketch appears below)
- test the manipulability of calibrated learning
- think about MD to allow calibrated learning to find "good" correlated equilibria (CE)

3/13 Correlated Q-learning; cyclic equilibria
- how to correlate play in correlated Q-learning?
- how to distribute the choice of CE? (a CE-selection sketch appears below)
- speed up convergence in correlated Q-learning?
- explore the possibility of manipulation

3/15 Rational learning; beliefs in repeated games
- empirical study of how beliefs impact utility in game play (a belief-updating sketch appears below)
- compute the complexity of the Kalai & Lehrer approach in a small setting

3/20 Efficient learning equilibrium (ELE)
- study ELE with discounting
- study ELE with agent beliefs
- study subgame-perfect ELE
- study simpler ELEs that use the folk theorem in different ways
- empirical study of ELE

3/22 Nash memory
- try different search algorithms in place of the genetic algorithm (GA)
- apply the techniques in both papers to different, perhaps smaller, well-motivated games (domains) [note: the compute time of Phelps is huge!]

4/3 Agenda of research in multi-agent learning
- write your own "on the agenda"
- instantiate the authors' ideas on teaching in an experimental setting
- try to relax one or more of the five assumptions in the "Learning against opponents..." paper
- examine other models of bounded memory
- explore the case of alternative metrics, such as no-regret

4/10 Hayek machine
- compare Holland classifiers vs. Hayek (empirically)
- consider hierarchical RL in Hayek
- consider decoupling strategy selection from evaluation
- consider the implications of second-price vs. first-price rules
- examine coevolution of problem instances and learning
- study transfer learning (from one domain to another)

4/12 Partially-controlled multi-agent systems
- collusion among malicious agents: two teams designing competing "P"s
- methods to share information across "P" agents
- effect of noisy information on profits
- changes to the assumptions about monitoring
- other goals for teaching (a teacher-learner testbed sketch appears after this list)
- relaxing the assumptions in part 2 of the paper
- teaching non-Q learners
- new algorithms for Q-learning with a teacher-learner
- multi-player settings

4/17 Inverse RL (IRL)
- mechanism design with IRL
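
The sketches below are illustrative starting points, not assignments; all game matrices, parameters, and run lengths are made up. First, a minimal implementation of two-player fictitious play for the 3/1 items: each player best-responds to the opponent's empirical action frequencies.

    import numpy as np

    def fictitious_play(A, B, rounds=5000):
        """Two-player fictitious play on the bimatrix game (A, B).
        A[i, j]: row player's payoff; B[i, j]: column player's payoff."""
        m, n = A.shape
        col_counts = np.ones(n)   # row player's counts of column player's actions
        row_counts = np.ones(m)   # column player's counts of row player's actions
        for _ in range(rounds):
            # Best-respond to the opponent's empirical mixture.
            i = int(np.argmax(A @ (col_counts / col_counts.sum())))
            j = int(np.argmax((row_counts / row_counts.sum()) @ B))
            row_counts[i] += 1
            col_counts[j] += 1
        return row_counts / row_counts.sum(), col_counts / col_counts.sum()

    # Matching pennies: the empirical frequencies converge to (1/2, 1/2)
    # even though actual play cycles -- a good starting point for the
    # non-convergence and manipulability questions above.
    A = np.array([[1, -1], [-1, 1]])
    x, y = fictitious_play(A, -A)
    print(x, y)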
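
For the 3/6 Nash-Q items, the main obstacle is that every backup needs an equilibrium of the stage game defined by the current Q-values. In the zero-sum special case this is Littman's minimax-Q, and the stage game reduces to a linear program; a sketch of that special case follows, assuming tabular Q- and V-tables indexed by state.

    import numpy as np
    from scipy.optimize import linprog

    def solve_matrix_game(M):
        """Value and optimal mixture for the row player of zero-sum game M."""
        m, n = M.shape
        c = np.zeros(m + 1)
        c[-1] = -1.0                                # maximize the value v
        A_ub = np.hstack([-M.T, np.ones((n, 1))])   # v <= x @ M[:, j] for all j
        A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * m + [(None, None)])
        return res.x[-1], res.x[:m]

    def minimax_q_update(Q, V, s, a, o, r, s2, alpha=0.1, gamma=0.9):
        """One minimax-Q backup; Q[s] is an (own actions x opponent actions)
        matrix. Nash-Q replaces solve_matrix_game with a general-sum
        equilibrium solver, which is where the computational trouble starts."""
        Q[s][a, o] += alpha * (r + gamma * V[s2] - Q[s][a, o])
        V[s], _ = solve_matrix_game(Q[s])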
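
For the 3/8 items, a sketch of simple (external) regret matching in self-play. Note the hedge: the full Hart & Mas-Colell procedure tracks conditional regrets and drives empirical play to the set of correlated equilibria, while this external variant only guarantees convergence to the Hannan set; it is still a useful baseline for the convergence-speed and manipulability experiments.

    import numpy as np

    def regret_matching_selfplay(A, B, rounds=20000, seed=0):
        """Both players play proportionally to positive cumulative regret.
        Returns the empirical joint-action frequencies."""
        rng = np.random.default_rng(seed)
        m, n = A.shape
        R1, R2 = np.zeros(m), np.zeros(n)   # cumulative external regrets
        joint = np.zeros((m, n))
        i, j = 0, 0                          # arbitrary initial actions
        for _ in range(rounds):
            joint[i, j] += 1
            R1 += A[:, j] - A[i, j]          # regret vs. each fixed action
            R2 += B[i, :] - B[i, j]
            i = regret_step(R1, i, rng)
            j = regret_step(R2, j, rng)
        return joint / rounds

    def regret_step(R, current, rng):
        """Sample proportionally to positive regret; stay put if none."""
        pos = np.maximum(R, 0.0)
        if pos.sum() == 0.0:
            return current
        return int(rng.choice(len(R), p=pos / pos.sum()))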
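
For the 3/13 items, correlated-Q needs an equilibrium-selection rule at every state. Below is a sketch of one standard rule, the utilitarian CE (maximize the sum of payoffs), posed as a linear program over joint distributions for a two-player stage game with Q-matrices Q1, Q2; the chicken-game payoffs in the usage line are invented for illustration.

    import numpy as np
    from scipy.optimize import linprog

    def utilitarian_ce(Q1, Q2):
        """Joint distribution p(a, b) maximizing the payoff sum subject to
        the correlated-equilibrium incentive constraints."""
        m, n = Q1.shape
        c = -(Q1 + Q2).ravel()               # linprog minimizes
        rows = []
        # Player 1: for recommended a and deviation a',
        # sum_b p(a, b) * (Q1[a', b] - Q1[a, b]) <= 0.
        for a in range(m):
            for a2 in range(m):
                if a2 == a:
                    continue
                row = np.zeros((m, n))
                row[a, :] = Q1[a2, :] - Q1[a, :]
                rows.append(row.ravel())
        # Player 2, symmetric over columns.
        for b in range(n):
            for b2 in range(n):
                if b2 == b:
                    continue
                row = np.zeros((m, n))
                row[:, b] = Q2[:, b2] - Q2[:, b]
                rows.append(row.ravel())
        res = linprog(c, A_ub=np.array(rows), b_ub=np.zeros(len(rows)),
                      A_eq=np.ones((1, m * n)), b_eq=[1.0],
                      bounds=[(0, None)] * (m * n))
        return res.x.reshape(m, n)

    # Chicken: the utilitarian CE puts weight on (C, C) and the two
    # asymmetric outcomes, and zero on mutual defection.
    Q1 = np.array([[6.0, 2.0], [7.0, 0.0]])
    print(utilitarian_ce(Q1, Q1.T).round(3))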
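
For the 3/15 items, rational learning in the Kalai & Lehrer sense is Bayesian updating over a hypothesis set of opponent strategies. A toy sketch for a repeated Prisoner's Dilemma follows; the hypothesis set, the noise level in tit-for-tat, and the "mirroring" behavior of the learner are all invented for illustration.

    import numpy as np

    # Each hypothesis maps my previous action to a distribution over the
    # opponent's actions (C, D) = (0, 1).
    def always_c(prev): return np.array([1.0, 0.0])
    def always_d(prev): return np.array([0.0, 1.0])
    def tit_for_tat(prev):       # noisy TFT, so likelihoods stay positive
        return np.array([0.9, 0.1]) if prev == 0 else np.array([0.1, 0.9])

    hypotheses = [always_c, always_d, tit_for_tat]
    belief = np.ones(3) / 3      # uniform prior with a "grain of truth"
    truth = tit_for_tat
    rng = np.random.default_rng(0)

    my_prev = 0
    for t in range(50):
        obs = rng.choice(2, p=truth(my_prev))        # opponent's actual play
        like = np.array([h(my_prev)[obs] for h in hypotheses])
        belief = belief * like / (belief @ like)     # Bayes update
        my_prev = obs                                # toy: I mirror the opponent
    print(belief)    # posterior mass concentrates on tit_for_tat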
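
Finally, for the 4/12 teaching items, a small testbed: a tit-for-tat "teacher" plays a repeated Prisoner's Dilemma against a tabular Q-learner whose state is its own previous action (exactly what the teacher will echo). The payoffs, learning rate, and discount are made up; with these settings the learner's optimal policy is to cooperate in both states, and whether a given learner actually finds that policy is the empirical question the items above raise.

    import numpy as np

    # Learner's Prisoner's Dilemma payoffs; actions: 0 = cooperate, 1 = defect.
    # PAYOFF[a, b] = learner's payoff for its action a vs. teacher action b.
    PAYOFF = np.array([[3.0, 0.0],
                       [5.0, 1.0]])

    rng = np.random.default_rng(0)
    Q = np.zeros((2, 2))    # state = learner's previous action
    alpha, gamma, eps = 0.1, 0.95, 0.1

    s = 0
    for t in range(200000):
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        b = s                # tit-for-tat teacher echoes the learner's last move
        r = PAYOFF[a, b]
        Q[s, a] += alpha * (r + gamma * Q[a].max() - Q[s, a])
        s = a
    # 0 0 => the learner has learned to cooperate from both states.
    print(np.argmax(Q[0]), np.argmax(Q[1]))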