3/1 Fictitious play (FP), non-convergence
- explore how manipulable FP really is
- implement and test proposed extensions to FP to see how convergent they are (a minimal FP sketch appears after this list)
- think about mechanism design (MD) to make FP convergent

3/6 Nash Q-learning
- apply to a repeated game with the state extended to include a little history; can it converge to trigger strategies?
- run Nash Q on a VCG-style setting; does it converge?
- think about a model-based alternative to Nash Q
- is there something more like FP that can work in stochastic games (SGs)? (see the minimax-Q sketch below for the zero-sum special case)

3/8 Calibrated learning
- implement the calibrated learning algorithm (e.g. following the discussion in Greenwald, Jafari and Marks) and see how quickly it converges
- read the Hart & Mas-Colell paper, implement it, and test it (a regret-matching sketch appears below)
- test the manipulability of calibrated learning
- think about MD to allow calibrated learning to find "good" correlated equilibria (CE)

3/13 Correlated Q-learning; cyclic equilibria
- how to correlate play in correlated Q-learning?
- how to distribute the choice of CE? (a CE-selection sketch appears below)
- speed up convergence in correlated Q-learning?
- explore the possibility of manipulation

3/15 Rational learning; beliefs in repeated games
- empirical study of how beliefs impact utility in game play (a belief-updating sketch appears below)
- compute the complexity of the Kalai & Lehrer approach in a small setting

3/20 Efficient learning equilibrium (ELE)
- study ELE with discounting
- study ELE with agent beliefs
- study subgame-perfect ELE
- study simpler ELEs that use the folk theorem in different ways
- empirical study of ELE

3/22 Nash memory
- try different search algorithms in place of the genetic algorithm (GA)
- apply the techniques in both papers to different, perhaps smaller, well-motivated games (domains) [note: the compute time of Phelps is huge!]

4/3 Agenda of research in multi-agent learning
- write your own "on the agenda"
- instantiate the authors' ideas on teaching in an experimental setting
- try to relax one or more of the five assumptions in the "Learning against opponents..." paper
- examine other models of bounded memory
- explore the case of alternative metrics, such as no-regret

4/10 Hayek machine
- compare Holland classifiers vs. Hayek (empirically)
- consider hierarchical RL in Hayek
- consider decoupling strategy selection from evaluation
- consider the implications of second-price vs. first-price rules
- examine coevolution of problem instances and learning
- study transfer learning (from one domain to another)

4/12 Partially-controlled multi-agent systems
- collusion among malicious agents: two teams designing competing "P"s
- methods to share information across "P" agents
- effect of noisy information on profits
- changes to the assumptions about monitoring
- other goals for teaching (a teacher-learner testbed sketch appears after this list)
- relaxing the assumptions in part 2 of the paper
- teaching non-Q learners
- new algorithms for Q-learning with a teacher-learner
- multi-player settings

4/17 Inverse RL (IRL)
- mechanism design with IRL
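
The sketches below are illustrative starting points, not assignments; all game matrices, parameters, and run lengths are made up. First, a minimal implementation of two-player fictitious play for the 3/1 items: each player best-responds to the opponent's empirical action frequencies.

    import numpy as np

    def fictitious_play(A, B, rounds=5000):
        """Two-player fictitious play on the bimatrix game (A, B).
        A[i, j]: row player's payoff; B[i, j]: column player's payoff."""
        m, n = A.shape
        col_counts = np.ones(n)   # row player's counts of column player's actions
        row_counts = np.ones(m)   # column player's counts of row player's actions
        for _ in range(rounds):
            # Best-respond to the opponent's empirical mixture.
            i = int(np.argmax(A @ (col_counts / col_counts.sum())))
            j = int(np.argmax((row_counts / row_counts.sum()) @ B))
            row_counts[i] += 1
            col_counts[j] += 1
        return row_counts / row_counts.sum(), col_counts / col_counts.sum()

    # Matching pennies: the empirical frequencies converge to (1/2, 1/2)
    # even though actual play cycles -- a good starting point for the
    # non-convergence and manipulability questions above.
    A = np.array([[1, -1], [-1, 1]])
    x, y = fictitious_play(A, -A)
    print(x, y)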
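
For the 3/6 Nash-Q items, the main obstacle is that every backup needs an equilibrium of the stage game defined by the current Q-values. In the zero-sum special case this is Littman's minimax-Q, and the stage game reduces to a linear program; a sketch of that special case follows, assuming tabular Q- and V-tables indexed by state.

    import numpy as np
    from scipy.optimize import linprog

    def solve_matrix_game(M):
        """Value and optimal mixture for the row player of zero-sum game M."""
        m, n = M.shape
        c = np.zeros(m + 1)
        c[-1] = -1.0                                # maximize the value v
        A_ub = np.hstack([-M.T, np.ones((n, 1))])   # v <= x @ M[:, j] for all j
        A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * m + [(None, None)])
        return res.x[-1], res.x[:m]

    def minimax_q_update(Q, V, s, a, o, r, s2, alpha=0.1, gamma=0.9):
        """One minimax-Q backup; Q[s] is an (own actions x opponent actions)
        matrix. Nash-Q replaces solve_matrix_game with a general-sum
        equilibrium solver, which is where the computational trouble starts."""
        Q[s][a, o] += alpha * (r + gamma * V[s2] - Q[s][a, o])
        V[s], _ = solve_matrix_game(Q[s])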
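
For the 3/8 items, a sketch of simple (external) regret matching in self-play. Note the hedge: the full Hart & Mas-Colell procedure tracks conditional regrets and drives empirical play to the set of correlated equilibria, while this external variant only guarantees convergence to the Hannan set; it is still a useful baseline for the convergence-speed and manipulability experiments.

    import numpy as np

    def regret_matching_selfplay(A, B, rounds=20000, seed=0):
        """Both players play proportionally to positive cumulative regret.
        Returns the empirical joint-action frequencies."""
        rng = np.random.default_rng(seed)
        m, n = A.shape
        R1, R2 = np.zeros(m), np.zeros(n)   # cumulative external regrets
        joint = np.zeros((m, n))
        i, j = 0, 0                          # arbitrary initial actions
        for _ in range(rounds):
            joint[i, j] += 1
            R1 += A[:, j] - A[i, j]          # regret vs. each fixed action
            R2 += B[i, :] - B[i, j]
            i = regret_step(R1, i, rng)
            j = regret_step(R2, j, rng)
        return joint / rounds

    def regret_step(R, current, rng):
        """Sample proportionally to positive regret; stay put if none."""
        pos = np.maximum(R, 0.0)
        if pos.sum() == 0.0:
            return current
        return int(rng.choice(len(R), p=pos / pos.sum()))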
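
For the 3/13 items, correlated-Q needs an equilibrium-selection rule at every state. Below is a sketch of one standard rule, the utilitarian CE (maximize the sum of payoffs), posed as a linear program over joint distributions for a two-player stage game with Q-matrices Q1, Q2; the chicken-game payoffs in the usage line are invented for illustration.

    import numpy as np
    from scipy.optimize import linprog

    def utilitarian_ce(Q1, Q2):
        """Joint distribution p(a, b) maximizing the payoff sum subject to
        the correlated-equilibrium incentive constraints."""
        m, n = Q1.shape
        c = -(Q1 + Q2).ravel()               # linprog minimizes
        rows = []
        # Player 1: for recommended a and deviation a',
        # sum_b p(a, b) * (Q1[a', b] - Q1[a, b]) <= 0.
        for a in range(m):
            for a2 in range(m):
                if a2 == a:
                    continue
                row = np.zeros((m, n))
                row[a, :] = Q1[a2, :] - Q1[a, :]
                rows.append(row.ravel())
        # Player 2, symmetric over columns.
        for b in range(n):
            for b2 in range(n):
                if b2 == b:
                    continue
                row = np.zeros((m, n))
                row[:, b] = Q2[:, b2] - Q2[:, b]
                rows.append(row.ravel())
        res = linprog(c, A_ub=np.array(rows), b_ub=np.zeros(len(rows)),
                      A_eq=np.ones((1, m * n)), b_eq=[1.0],
                      bounds=[(0, None)] * (m * n))
        return res.x.reshape(m, n)

    # Chicken: the utilitarian CE puts weight on (C, C) and the two
    # asymmetric outcomes, and zero on mutual defection.
    Q1 = np.array([[6.0, 2.0], [7.0, 0.0]])
    print(utilitarian_ce(Q1, Q1.T).round(3))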
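
For the 3/15 items, rational learning in the Kalai & Lehrer sense is Bayesian updating over a hypothesis set of opponent strategies. A toy sketch for a repeated Prisoner's Dilemma follows; the hypothesis set, the noise level in tit-for-tat, and the "mirroring" behavior of the learner are all invented for illustration.

    import numpy as np

    # Each hypothesis maps my previous action to a distribution over the
    # opponent's actions (C, D) = (0, 1).
    def always_c(prev): return np.array([1.0, 0.0])
    def always_d(prev): return np.array([0.0, 1.0])
    def tit_for_tat(prev):       # noisy TFT, so likelihoods stay positive
        return np.array([0.9, 0.1]) if prev == 0 else np.array([0.1, 0.9])

    hypotheses = [always_c, always_d, tit_for_tat]
    belief = np.ones(3) / 3      # uniform prior with a "grain of truth"
    truth = tit_for_tat
    rng = np.random.default_rng(0)

    my_prev = 0
    for t in range(50):
        obs = rng.choice(2, p=truth(my_prev))        # opponent's actual play
        like = np.array([h(my_prev)[obs] for h in hypotheses])
        belief = belief * like / (belief @ like)     # Bayes update
        my_prev = obs                                # toy: I mirror the opponent
    print(belief)    # posterior mass concentrates on tit_for_tat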
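
Finally, for the 4/12 teaching items, a small testbed: a tit-for-tat "teacher" plays a repeated Prisoner's Dilemma against a tabular Q-learner whose state is its own previous action (exactly what the teacher will echo). The payoffs, learning rate, and discount are made up; with these settings the learner's optimal policy is to cooperate in both states, and whether a given learner actually finds that policy is the empirical question the items above raise.

    import numpy as np

    # Learner's Prisoner's Dilemma payoffs; actions: 0 = cooperate, 1 = defect.
    # PAYOFF[a, b] = learner's payoff for its action a vs. teacher action b.
    PAYOFF = np.array([[3.0, 0.0],
                       [5.0, 1.0]])

    rng = np.random.default_rng(0)
    Q = np.zeros((2, 2))    # state = learner's previous action
    alpha, gamma, eps = 0.1, 0.95, 0.1

    s = 0
    for t in range(200000):
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        b = s                # tit-for-tat teacher echoes the learner's last move
        r = PAYOFF[a, b]
        Q[s, a] += alpha * (r + gamma * Q[a].max() - Q[s, a])
        s = a
    # 0 0 => the learner has learned to cooperate from both states.
    print(np.argmax(Q[0]), np.argmax(Q[1]))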