
Project Guidelines and Due Dates

Announcements:

Please email Yiling and Alice your presentation files by midnight on Sunday, December 5.

Students will give brief project presentations from 1pm to 3pm on Monday, December 6, in MD221. Each presentation will be 8 minutes long (7 minutes of presentation + 1 minute of Q&A). All students are required to attend the project presentations.

Tentative project due dates:

For non-exposition papers, project proposals must include the following four sections:

  1. Describe at a high level what you will do and how it relates to the material covered in class. Include citations to related work and a description of how your proposed work relates to it.
  2. Define the exact problem you will be studying as carefully and completely as you can. For example, for a theory problem, give a concrete problem definition; for an experimental problem, describe what code you will write, what simulations you will run, and what graphs you will generate.
  3. State the first two concrete steps for moving your project forward. What could you start tomorrow?
  4. What is unclear to you? What can we help with? (E.g., have you found the related literature, do you need help scoping the work, is there something you're confused about?)

For an exposition paper, state which two papers you will study and which two technical results in those papers you will describe. How are the papers related to the class, and to each other? Why do you find the papers interesting?

Project Ideas

  1. Proper Scoring Rules: Choosing the appropriate scoring rule for a given application is rather a black art. Choose an application, and find several angles along which to compare some candidate proper scoring rules. (A small sketch comparing a few standard proper scoring rules appears after this list.)
  2. Exposition paper on methods to combine subjective probabilities.
  3. Signature functions: The signature function in Nicolas Lambert's paper is shown to exist by a non-constructive proof. However, once a scoring rule for a property is known, we can derive the corresponding signature function. Can we somehow study the signature functions that can be derived this way?
  4. Eliciting Coarse Reports in the Peer-Prediction Method: Even though everything could in principle be discretized, sometimes we may want to elicit information at a rather coarse level. Miller et al. give an example of how to elicit truthful information when there are two types of products. Could you study whether coarse information can be truthfully elicited in more complex environments?
  5. Simulations of the peer-prediction method: The various methods we covered all rely on certain assumptions. Can you relax some of these assumptions and run simulations to see how the methods perform in practice? For instance, relax the common-prior assumption. (A simulation sketch of the basic scoring step appears after this list.)
  6. Peer-prediction method: This method does not account for the different tastes of raters. The authors briefly discuss how one could explicitly take this factor into account. Can you augment the method to account for it explicitly? Or can you run an experiment to study how the center could efficiently estimate the different tastes of individuals?
  7. Bayesian Truth Serum: This method assumes a countably infinite number of participants. Can we use simulations to study the effect of the number of participants on the accuracy of the prediction generated by this method? (A simulation sketch appears after this list.)
  8. Evaluating the value of information in a prediction market: We covered some concepts from information theory, and we could potentially use them to evaluate the value of the information contributed by participants. Can you think of a way to study the value of the information that participants contribute to a prediction market?
  9. Learning in peer prediction: Each peer-prediction method essentially proposes a game among the participating agents. If the mechanism is reasonably complex, the agents might not be able to compute their optimal strategy the first time around. Can we allow the agents to participate in the peer-prediction method repeatedly and gradually learn the optimal strategy? For instance, fix the strategies of some honest or dishonest players and see how the other players behave in this repeated setting.
  10. How to set the liquidity parameter "b" in the logarithmic market scoring rule is more of an art than a science. Othman et al. proposed a liquidity-sensitive market maker that increases b dynamically as orders come in. Can you think of a general method for adapting the liquidity of the market for other market-scoring-rule market makers? (A numeric sketch of how b affects LMSR prices appears after this list.)
  11. Combining prediction markets with alternative forecasting methods: The paper Prediction Without Markets compared the performance of prediction markets with alternative statistical forecasting methods. We also learned that the market scoring rule can essentially be interpreted as a no-regret learner. Can you think of ways to enhance the performance of prediction methods by hybridizing them with other statistical or learning methods, e.g., a unified approach to prediction?
  12. Combinatorial Prediction Markets: In theory, combinatorial prediction markets can potentially aggregate more refined information. However, interacting with a combinatorial prediction market may increase the cognitive cost of participants. Perform an experimental study to examine whether combinatorial prediction markets aggregate information better than simple prediction markets.
  13. Exposition paper on online learning and online portfolio selection (e.g. universal portfolio management).
  14. Typically, supervised learning assumes that training data are drawn i.i.d. from some distribution. In reality, however, data can be provided by heterogeneous sources of differing quality. This setting is especially relevant to crowdsourcing. Write an exposition paper on learning from the "crowds" (heterogeneous sources).
  15. DARPA Network Challenge: The MIT team's mechanism is not sybil-proof; that is, someone who finds a balloon can create multiple identities and obtain a higher reward. Can you think of an improved mechanism that achieves sybil-proofness, and analyze the properties of the resulting mechanism?
  16. Crowdsourcing: DiPalantino et al. model online crowdsourcing sites as all-pay auctions. Study some variations of the mechanism, e.g., what if the mechanism rewards more than one participant per competition?
  17. Pick some online crowdsourcing systems and compare how they work. Point out inefficiencies in these systems and suggest how they could be remedied through appropriately designed incentives.
  18. Survey existing research that uses Amazon Mechanical Turk (AMT) and suggest an experiment that could be conducted using AMT.
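A few illustrative sketches for selected ideas above follow (all in Python). The parameter values, helper names, and modeling choices in them are assumptions made for illustration, not anything prescribed by the cited papers.

For idea 1, this sketch compares three standard proper scoring rules (logarithmic, quadratic/Brier, and spherical) on a binary event and checks numerically that, for each rule, the expected score is maximized by reporting one's true belief; comparing other properties (sensitivity to extreme forecasts, boundedness, and so on) would be the substance of the project.

    # Sketch: compare three standard proper scoring rules on a binary event.
    # The helper names and the grid search below are illustrative choices.
    import numpy as np

    def log_score(p, outcome):
        """Logarithmic score: log of the probability assigned to the realized outcome."""
        return np.log(p if outcome == 1 else 1.0 - p)

    def brier_score(p, outcome):
        """Quadratic (Brier-style) score, signed so that higher is better."""
        return -((outcome - p) ** 2 + ((1 - outcome) - (1 - p)) ** 2)

    def spherical_score(p, outcome):
        """Spherical score: probability of the realized outcome over the L2 norm of the forecast."""
        return (p if outcome == 1 else 1.0 - p) / np.sqrt(p ** 2 + (1 - p) ** 2)

    def expected_score(rule, report, belief):
        """Expected score of announcing `report` when the forecaster's true belief is `belief`."""
        return belief * rule(report, 1) + (1 - belief) * rule(report, 0)

    if __name__ == "__main__":
        belief = 0.7
        reports = np.linspace(0.01, 0.99, 99)
        for rule in (log_score, brier_score, spherical_score):
            best = reports[np.argmax([expected_score(rule, r, belief) for r in reports])]
            print(f"{rule.__name__}: expected score maximized near report {best:.2f}")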
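For idea 5 (and ideas 4 and 6), a small simulation sketch of the basic peer-prediction scoring step from Miller et al., with two product types and binary signals; the prior and signal likelihoods are made-up numbers. It compares the average payment of a truthful rater with that of a rater who always reports "high", assuming the reference rater is truthful. Relaxing the common-prior assumption would amount to letting the raters' beliefs in this model differ from the ones the center uses to compute the posterior.

    # Sketch: the Miller-Resnick-Zeckhauser peer-prediction scoring step with two
    # product types and binary signals. The prior and likelihoods are made-up numbers.
    import numpy as np

    rng = np.random.default_rng(0)

    P_GOOD = 0.5                      # prior probability that the product is "good"
    P_HIGH_GIVEN = {True: 0.8,        # P(high signal | good product)
                    False: 0.3}       # P(high signal | bad product)

    def posterior_peer_high(report_high):
        """P(reference rater sees a high signal | my reported signal), under the common prior."""
        like_good = P_HIGH_GIVEN[True] if report_high else 1 - P_HIGH_GIVEN[True]
        like_bad = P_HIGH_GIVEN[False] if report_high else 1 - P_HIGH_GIVEN[False]
        p_good = like_good * P_GOOD / (like_good * P_GOOD + like_bad * (1 - P_GOOD))
        return p_good * P_HIGH_GIVEN[True] + (1 - p_good) * P_HIGH_GIVEN[False]

    def log_payment(report_high, peer_report_high):
        """Score my report with the log scoring rule against the peer's realized report."""
        q = posterior_peer_high(report_high)
        return np.log(q if peer_report_high else 1 - q)

    def average_payment(truthful, n_rounds=50_000):
        """Average payment to rater 1 when truthful, or when always reporting 'high'."""
        total = 0.0
        for _ in range(n_rounds):
            good = rng.random() < P_GOOD
            s1 = rng.random() < P_HIGH_GIVEN[good]    # rater 1's private signal
            s2 = rng.random() < P_HIGH_GIVEN[good]    # reference rater reports truthfully
            total += log_payment(s1 if truthful else True, s2)
        return total / n_rounds

    if __name__ == "__main__":
        print("truthful reporting:  ", average_payment(truthful=True))
        print("always report 'high':", average_payment(truthful=False))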
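For idea 7, a rough simulation sketch of how the number of respondents affects the answer picked out by the Bayesian Truth Serum information score on a binary question. The world model (symmetric prior, fixed signal accuracy Q) and the selection rule "choose the answer with the higher information score" are illustrative assumptions; how accuracy degrades for small numbers of respondents is exactly what the project would study.

    # Sketch: effect of the number of respondents on the answer selected by the
    # Bayesian Truth Serum information score on a binary question. The world model
    # (symmetric prior, signal accuracy Q) and the selection rule are illustrative.
    import numpy as np

    rng = np.random.default_rng(1)
    Q = 0.7        # assumed P(respondent's signal matches the true answer)
    PRIOR = 0.5    # prior probability that the true answer is "A"

    def predicted_freq_A(signal_is_A):
        """A truthful respondent's posterior prediction of the population frequency of 'A'."""
        like_A = Q if signal_is_A else 1 - Q          # P(signal | true answer is A)
        like_B = 1 - Q if signal_is_A else Q          # P(signal | true answer is B)
        p_A = like_A * PRIOR / (like_A * PRIOR + like_B * (1 - PRIOR))
        return p_A * Q + (1 - p_A) * (1 - Q)

    def bts_picks_truth(n):
        """One trial: does the answer with the higher information score match the truth?"""
        truth_is_A = rng.random() < PRIOR
        signals = rng.random(n) < (Q if truth_is_A else 1 - Q)   # True means signal 'A'
        x_bar_A = np.clip(signals.mean(), 1e-6, 1 - 1e-6)        # endorsement frequency of A
        preds_A = np.where(signals, predicted_freq_A(True), predicted_freq_A(False))
        info_A = np.log(x_bar_A) - np.mean(np.log(preds_A))      # log(x_bar / geometric-mean y)
        info_B = np.log(1 - x_bar_A) - np.mean(np.log(1 - preds_A))
        return (info_A > info_B) == truth_is_A

    if __name__ == "__main__":
        for n in (5, 10, 25, 50, 100, 500):
            hits = np.mean([bts_picks_truth(n) for _ in range(2000)])
            print(f"n = {n:4d}: selected answer matches the truth {hits:.1%} of the time")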
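For idea 10, a small numeric sketch of the LMSR cost function C(q) = b * log(sum_i exp(q_i / b)), showing how the liquidity parameter b controls how far a fixed trade moves the price and how much it costs. Othman et al.'s liquidity-sensitive variant, roughly speaking, replaces the constant b with a quantity that grows with the total number of outstanding shares; the traded quantities below are arbitrary.

    # Sketch: Hanson's LMSR cost function and how the liquidity parameter b controls
    # price movement for a fixed trade. The traded quantities below are arbitrary.
    import numpy as np

    def lmsr_cost(q, b):
        """LMSR cost function C(q) = b * log(sum_i exp(q_i / b))."""
        return b * np.log(np.sum(np.exp(np.asarray(q, dtype=float) / b)))

    def lmsr_prices(q, b):
        """Instantaneous prices p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
        z = np.exp(np.asarray(q, dtype=float) / b)
        return z / z.sum()

    def cost_of_trade(q, delta, b):
        """Amount a trader pays to move the outstanding shares from q to q + delta."""
        q = np.asarray(q, dtype=float)
        return lmsr_cost(q + np.asarray(delta, dtype=float), b) - lmsr_cost(q, b)

    if __name__ == "__main__":
        q0 = np.array([0.0, 0.0])      # two-outcome market, no shares sold yet
        trade = np.array([10.0, 0.0])  # buy 10 shares of outcome 1
        for b in (5.0, 20.0, 100.0):
            price = lmsr_prices(q0 + trade, b)[0]
            print(f"b = {b:5.1f}: trade costs {cost_of_trade(q0, trade, b):6.2f}, "
                  f"new price of outcome 1 = {price:.3f}")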