Student comments 12/01/2008

 

Paper A: "Eliciting Informative Feedback: The Peer-Prediction Method"

Paper B: "Minimum Payments that Reward Honest Reputation Feedback"

 

Nick Wells

 

Paper A

 

The authors of this paper lay out the structure of a scoring system designed to elicit honest feedback from raters. They prove that honest scoring is a Nash equilibrium, though not necessarily a unique one. In (1) the authors make the point that users should be truthful so long as their expected transfer payment from being truthful is greater than from being untruthful. They go on to examine different scoring rules, showing how each fits within this framework.
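
To make the truthfulness condition in (1) concrete, here is a minimal numeric sketch (my own illustration with assumed priors and signal probabilities, not the paper's example) of the peer-prediction payment under the log scoring rule the paper discusses: a rater is paid the log score of the posterior her report implies for a reference rater's report, and honest reporting maximizes her expected payment.

import numpy as np

prior = np.array([0.5, 0.5])   # P(type) for types {good, bad} -- assumed
p_h = np.array([0.8, 0.3])     # P(signal = h | type) -- assumed

def posterior_ref_h(my_signal_h):
    """P(reference rater sees h | my signal), via Bayes' rule."""
    like = p_h if my_signal_h else 1 - p_h
    post_type = like * prior / np.sum(like * prior)
    return float(np.dot(p_h, post_type))

def expected_log_score(report_h, true_signal_h):
    """Expected log-score payment of reporting report_h given what I actually saw."""
    q = posterior_ref_h(report_h)       # posterior implied by my report
    p = posterior_ref_h(true_signal_h)  # my true belief about the reference report
    return p * np.log(q) + (1 - p) * np.log(1 - q)

for saw_h in (True, False):
    honest = expected_log_score(saw_h, saw_h)
    lie = expected_log_score(not saw_h, saw_h)
    print(saw_h, round(honest, 4), round(lie, 4), honest > lie)  # honest wins

Because the log scoring rule is proper, the honest report beats the lie whenever the two implied posteriors differ, which is exactly the stochastic relevance condition the paper requires.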

 

A potential problem with this system is that raters may not believe that honesty is a Nash equilibrium and thus will not abide by that policy. They might also have different priors coming into the system. All sorts of things could potentially complicate matters, and I'm not sure we can assume that everyone is perfectly rational.

 

In economics, rational ignorance refers to situations where the cost of reasoning through a system outweighs the benefit users receive from it. In practice, users of scoring systems do not take the time to think things through as thoroughly as this paper does, even for simple decisions. Buyers often want to make their purchase as quickly and conveniently as possible.

 

I am also not sure that this paper discusses the root of the incentive to participate in much depth. It operates on the premise of a reward system, which is necessary for its analysis. In reality, most users rate things for free. It's similar to giving someone a tip. That is my main hesitation with this analysis. Perhaps the economic literature on intrinsic motivation would provide a different perspective.

 

Paper B

 

This paper first constructs a system that minimizes the budget for a truth-telling, incentive-oriented feedback mechanism. The approximation methods they use seem to be computationally straightforward but time-consuming. They use Matlab's standard linear solver. I wonder if they might achieve faster runtimes with a different scheme, though this is beside the point.
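
For intuition, the budget-minimization problem can be sketched as a small linear program (my own toy construction with assumed numbers, not the paper's exact formulation; it reuses the hypothetical posteriors from the sketch under Paper A): choose payments tau(report, reference report) minimizing the expected budget, subject to honesty and participation constraints.

from scipy.optimize import linprog
import numpy as np

p_ref_h = {True: 0.6636, False: 0.4111}  # P(reference = h | my signal) -- assumed
p_sig_h = 0.55                           # P(my signal = h) under the prior -- assumed
margin, cost = 0.1, 0.05                 # honesty margin, reporting cost -- assumed

# Decision variables: tau = [t_hh, t_hl, t_lh, t_ll] = payment(report, reference).
def exp_pay(report_h, signal_h):
    """Coefficient row: expected payment of reporting report_h given my signal."""
    q = p_ref_h[signal_h]
    row = np.zeros(4)
    if report_h:
        row[0], row[1] = q, 1 - q
    else:
        row[2], row[3] = q, 1 - q
    return row

# Honesty must beat lying by `margin`; expected pay must cover `cost`.
A_ub = [exp_pay(False, True) - exp_pay(True, True),    # saw h, lying is worse
        exp_pay(True, False) - exp_pay(False, False),  # saw l, lying is worse
        -exp_pay(True, True),                          # participation, saw h
        -exp_pay(False, False)]                        # participation, saw l
b_ub = [-margin, -margin, -cost, -cost]

# Objective: minimize the expected payment per report.
c = p_sig_h * exp_pay(True, True) + (1 - p_sig_h) * exp_pay(False, False)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print(res.x, res.fun)  # optimal payments and the minimal expected budget

With only two signal values this is a four-variable LP; the paper's point is that the same formulation scales to realistic signal spaces, which is where solver choice starts to matter.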

 

This paper also suggests the use of a filtering mechanism for rooting out false reports. Assuming we are in a state where most people tell the truth, this mechanism should work well. If a lot of people reported falsely, however, this might not hold. Additionally, this mechanism relies on a small number of trustworthy reports, whereas in reality such a collection would not exist, and if it did, we would not need to elicit feedback elsewhere. If truthful behavior is common, this appears to be a good mechanism for combating bribery and fake buyers.
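
A rough sketch of the filtering idea (my own construction, not the paper's exact rule): drop any report whose value is too improbable given the other reports, which works precisely when most raters are honest and fails when they are not.

import numpy as np

def filter_reports(reports, min_likelihood=0.05):
    """Keep report i only if its value has probability >= min_likelihood
    under the leave-one-out empirical distribution of the other reports."""
    reports = np.asarray(reports)
    kept = []
    for i, r in enumerate(reports):
        rest = np.delete(reports, i)
        p_one = rest.mean()                       # empirical P(rating = 1)
        likelihood = p_one if r == 1 else 1 - p_one
        if likelihood >= min_likelihood:
            kept.append(int(r))
    return kept

print(filter_reports([1] * 19 + [0]))  # the lone dissenting 0 is filtered out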

 

Sagar Mehta

 

These papers present an interesting approach to designing a scoring system that elicits truthful evaluations. I question the assumption that a complicated incentive mechanism is necessary to ensure honest/informative feedback in the first place, however. The motivations that the authors provide in the introduction to the Miller paper are underprovision (people have no incentive to spend time reporting) and honesty (desire to be nice or fear of retaliation may cause them to withhold negative feedback; conflicts of interest may result in distorted opinions). I agree there is some cost to providing feedback, but a simple reward for providing any feedback at all on products you buy could remedy that. I'm not convinced that people largely fail to report honestly (in the absence of ulterior motives such as creating a false name and reporting about yourself). If 95% of people report honest feedback for a given product, do we really need a complicated scoring mechanism to get honest reporting from the other 5%? I wonder if there are empirical studies which specifically explore the question "do people report honestly?" I believe that in general people will report honestly, since they get utility from doing so and people generally value fairness. Given that any person looking to purchase a product will more likely look at an average of ratings, do we really need a mechanism to ensure honest reporting? Furthermore, given that the influence of any one reporter is diminished when the total number of reporters becomes large (i.e., one bad rating out of 1,000 will have very little impact on my good plumber's price), the scoring mechanism described seems useful only when the total number of reporters is small (assuming a large percentage of reporters don't have ulterior motives).

 

My second concern with the approach given is that it is impossible to truly measure the external benefit an individual may get from misreporting. Scaling of the scoring function is proposed as a mechanism to overwhelm individuals' outside preferences. But this will end up increasing the necessary payouts to each individual, thereby increasing the total cost of procuring information. It is also unclear how you can quantify external preferences, which should vary on a case-by-case basis.

 

I also wonder about the fact that there are multiple Nash equilibria in the game. I think this would make for an interesting lab experiment to see which equilibrium players settle on. My intuition is that players will settle on fair reporting in the absence of ulterior motives. You could then introduce some players with ulterior motives who get utility from misreporting and see how this impacts the equilibrium.

 

Subhash Arja

 

Paper A

 

This paper considers a mechanism in which raters' inputs are fed into a scoring system, and the system infers the posterior beliefs about another rater's report. It seeks to solve the issue of giving raters on sites such as eBay and Amazon an incentive to give honest feedback, rather than submit to peer pressure or withhold feedback out of lack of interest. I think the paper's motivation is very good since, in most systems, sellers and buyers have little incentive to give good ratings and can sometimes overreact to mishandled transactions. I thought the discussion of handling manipulation was lacking in the paper. I would have liked to see a discussion focused on preventing a group of participants from destroying another customer's reputation.

 

Paper B

 

This paper is an extension of the previous paper and addresses the same problem of giving raters incentives to report honestly. However, the authors also prove that their system can achieve a bound on the monetary incentive value for budget-constrained applications. I felt that the paper was structured very well and had the right combination of real-world applications and theoretical foundation. I also liked the results section, since the authors were able to quantify the tradeoffs between cost and information loss.

 

Victor Chan

 

The two papers, "Eliciting Informative Feedback: The Peer-Prediction Method" and "Minimum Payments that Reward Honest Reputation Feedback," by Miller et al. and Jurca and Faltings respectively, deal with online reputation mechanisms and how to get users to submit honest feedback. The first paper lays the theoretical foundation for eliciting honest feedback by using a proper scoring rule. The second paper uses automated mechanism design to compute the minimum budget required for payoffs in the feedback system when users give a rating.

The main contribution of the first paper is that it showed honest reporting is a Nash equilibrium when the payments are directly proportional to the scores given by the central system. This score is determined by comparing the implied posteriors of the new report with another report for the same item (the reference report). The paper then extends this idea to sequential and continuous cases. The authors also note that there exist other Nash equilibria in which users only submit one type of review, i.e., always h or always l. In this case it is fairly easy for users to collude, since simply agreeing with the existing reports is itself a Nash equilibrium. Finally, the paper discusses various implementation issues for such a feedback system, including users that are risk averse, users with different tastes, non-common priors, and private information. Even with these limitations, the paper is important because it shows that it is possible to elicit honest feedback from users simply by giving monetary incentives scored with a log scoring rule. This system could be applied to any online review site, since honest feedback is valuable in any situation.

 

The second paper is related to the first, since it shows exactly how the central/feedback entity can give out a minimum amount of money. The paper's main achievement is that it derives optimal payment schemes that find the minimal budget required once the margins for offsetting reporting costs and dishonesty gains are set; alternatively, it finds the maximum margin when budget constraints are set. The paper then shows how to solve these problems as linear optimization problems. The paper also discusses how to solve them with realistic inputs, and the results show that this type of payment scheme can be used in real-world settings.

 

Both these papers present interesting new ways of getting users to submit honest feedback. However, some shortfalls remain, including the failure to prevent collusion and other manipulation techniques. Understandably, these issues exist in other online systems as well. One of the biggest problems I can foresee in using these types of scoring methods is, again, how to explain to users how their payoffs are computed. Another problem is that there seems to be a requirement of a buy-in, or subscription fee, to help offset the budget for payoffs. This will not work on the internet, since many review sites are free, and if anything it is the viewer that is charged, not the reviewer. I would suspect that a far simpler system that just pays a few "experts" to review an item would work just as efficiently as having many peers reviewing.

 

Andrew Berry

 

Paper A

 

Miller, Resnick and Zeckhauser discuss a scoring system that makes honest reporting a Nash equilibrium. The analysis develops a method to obtain honest feedback when objective outcomes are not available. Additionally, the scoring model is flexible enough to handle sequential evaluations and continuous signals. Overall, the results were very positive and the analysis was very thorough. Stochastic relevance was a new term to me, and I am surprised that it does not come up more in the theoretical analyses we have read. I think future analysis needs to be done on the types of nontruthful equilibria that can exist. The authors state that the truthful equilibrium is not unique, and I am curious if there is any way to guarantee that the truthful equilibrium is a stable state or that the other equilibria are unstable. The paper also mentions the troubles that can arise in practical application and poses several potential solutions for each. Would those solutions remain feasible when the issues occur in conjunction? I am also still unclear about the explanations surrounding why the sequential equilibrium does not unravel, and how the mechanism adapts to elicit private information.

 

Paper B

 

This paper uses automated mechanism design to compute optimal payments that, given a certain budget constraint, maximize the margin. The mechanism loses in simplicity when compared to a proper scoring rule model, but gains in efficiency. In truth, this paper was too technical for me, but I wonder if the assumption that all buyers share a common belief regarding the prior is a legitimate one to make. I also would have liked to see more discussion about relaxing the assumption of risk neutrality. Can payments be raised high enough, in a practical sense, to make the model reasonable in application when dealing with risk aversion? I thought the example given in section 3 was helpful, and similar high-level explanations would have aided understanding of the model.

 

 

Ziyad Aljarboua

 

Paper A

 

This paper presents scoring systems to elicit feedback over the internet. The systems address the two major issues with online feedback, underprovision and honesty, by comparing a rater's feedback with that of her peers.

 

In this scheme, a rater's score is based on a comparison between the likelihoods assigned to a reference rater's possible ratings, where the probability distribution over the reference report has been updated using the rater's own report, and the reference rater's actual rating. Scores are then converted into monetary incentives for raters. Scores are based only on other raters' reports and no other information. Budget balance for an organization utilizing such a scoring system is not an issue, as the transfers occur between raters and the net transfer from the organization is zero. However, such a design might produce a scoring system without sufficient rewards, making voluntary participation harder (this is addressed in the 2nd paper).

 

As mentioned in the paper, some of the systems have limitations. Risk-averse raters might not provide truthful feedback in the scoring-rule-based system; ways to address truthfulness for risk-averse raters are discussed. Other limitations and issues with the scoring systems are discussed in the paper as well. The budget balance assumption rests on raters starting off with initial scores, after which all transfers accrue among raters. However, the source of the initial scores is not discussed here. Assuming these scores were seeded by the organization eliciting feedback, the budget balance assumption is not correct.

 

Paper B

 

This paper builds on the previous paper and introduces an automated mechanism for constructing payments to raters that minimizes the budget allocated to rewarding users. This idea stems from the assumption that the best way to address the problem of self-interested agents is to have the reward offset the gain from not telling the truth.

 

The authors of this paper stress that the difficulty of submitting feedback is one main reason that discourages people from rating a product or service. However, the opposite situation, in which raters are paid to submit feedback, also raises concerns, as it could draw more feedback from self-interested agents who might not give their true opinion.

 

This paper addresses two issues, the cost of feedback and the gain from untruthful feedback, through a payment scheme that explicitly rewards honest feedback so as to offset any such gain. An optimal payment scheme is devised using linear optimization techniques. Furthermore, a lower budget can be achieved by filtering out false reports.

 

I think a payment scheme based on offsetting dishonest feedback with a larger reward is not efficient. Unless an upper bound is placed on the reward, this system is impractical; and with no upper bound on the reward, this scheme is not efficient. The scheme is only valid under the assumption that the gain from dishonest feedback is minimal and can be offset by a larger reward without causing budget imbalances. Also, the gain from dishonest feedback is assumed to be known in order for such a payment scheme to work. Finally, some practical issues might arise from using such a scheme, as the cost of the linear optimization computation might be high depending on the number of users. However, it is noted that the payment scheme can be described by a closed-form function.

 

Xiaolu Yu

 

In the first paper, a side payment charged to the current buyer is paid to a subsequent buyer for her prediction of a later buyer's rating, according to a scoring rule. This approach makes truthful reporting by clients a Nash equilibrium, though not a unique one. Another side payment mechanism is proposed in the second paper.

The first paper sketches a framework to incentivize honest feedback, working around the lack of objective outcomes by comparing a user's reviews to those of her peers. The basic idea behind the mechanism is to use the feedback of a future client to rate a submitted report. The present report is used to update a probability distribution over the reference rater's report, and the payment is computed based on the likelihood assigned to the actual rating. The expected payment of a true report is always greater than the expected payment of a false report. This makes truthful reporting a Nash equilibrium.

 

However, it is difficult to take advantage of the extra control of a ranking algorithm in converting ratings to rankings. Furthermore, the approach doesn't address problems induced by malicious users who rarely care about their interests/losses.  Also, the impact of reviews on the outcome of the system is not explicitly explained.

 

The second paper designs an incentive-compatible payment scheme using a computational approach. The key idea is that the payments are computed by solving an optimization problem that minimizes the total budget required to reward the reporters, instead of using some fixed scoring rule. When submitting feedback, clients get payments depending both on the value they reported and on those reported by other clients, and this correlation can be used to design feedback payments that make honest reporting a Nash equilibrium.

 

The paper also presents ways to further lower the budget: (1) using multiple rating reports, and (2) applying a filtering technique to reports that are very likely to be false. These make the payment mechanism cheaper and more feasible.

 

Haoqi Zhang

 

Paper A

 

The main contribution of the paper is in introducing a proper scoring rule mechanism for eliciting honest feedback from users of a system, where the users report their observed signal of the underlying type (e.g., quality of a product) and the payments are set such that reporting the observed signal maximizes the expected score. The authors rely only on the fact that signals are correlated, and any proper scoring rule will work. Since proper scoring rules remain proper under positive affine transformations, constants can be chosen to ensure the satisfaction of participation constraints. The authors then present a number of extensions and discuss implications for actual implementation.
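
As a small illustration of the transformation point (my own sketch with assumed numbers): a positive affine rescaling a*S + b of the log scoring rule leaves the optimal report unchanged, so the constants a and b are free to satisfy participation constraints.

import numpy as np

def log_score(q, ref_is_h):
    """Log scoring rule: reward for reporting q = P(reference report = h)."""
    return float(np.log(q if ref_is_h else 1 - q))

def expected_score(q, p, a=1.0, b=0.0):
    """Expected value of a * S(q) + b when the rater's true belief is p."""
    return p * (a * log_score(q, True) + b) + (1 - p) * (a * log_score(q, False) + b)

p = 0.66  # the rater's true belief -- assumed
qs = np.linspace(0.01, 0.99, 99)
best_raw = qs[np.argmax([expected_score(q, p) for q in qs])]
best_scaled = qs[np.argmax([expected_score(q, p, a=2.0, b=5.0) for q in qs])]
print(best_raw, best_scaled)  # both ~0.66: truthful reporting stays optimal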

 


 

Rory Kulz

 

Paper A

 

A very interesting paper with a mechanism that elegantly solves a problem, but I had two thoughts. First, the assumption that all of the participants model product types and signals identically is quite strong, and so it would be interesting to see what can happen when that is weakened. But again this does seem an interesting "conceptual road map" to start down on.

 

The second thought I had, though, is one of practice: do we really need this? In particular, do we really believe that Internet feedback aggregators suffer from issues of underprovision and honesty? In reality, I think we see the same sort of machinery that drives the growth of Wikipedia -- just browse the comments on Yelp or Amazon. The example of Netflix is even poorer on the part of the authors, since Netflix provides a clear incentive to honestly rate films -- their Cinematch system. That is, you will (theoretically) receive better recommendations as the number of your ratings increases. Experience with the Internet leads me to believe that issues with "gaming the system" or bad outcomes are more likely to take the form of a Sybil-like attack, which we consider later in the week, than this sort of more casual or passive misaligned incentive.

 

Paper B

 

I don't have much to say on this paper; it's not all that interesting -- extending the other paper on the "MRZ" mechanism to efficiently minimize the budget required. My comments from my other email still apply regarding realistic applicability.

 

Angela Ying

 

Paper A

 

This paper discusses ways to elicit honest feedback from users for a given product. In particular, the paper proposes a system where users predict the reports of other users, kind of like a prediction market, and are rewarded based on how good their prediction is. The paper assumes that the users receive a private signal and make their predictions without knowing the predictions of other users, with all information revealed at the end. After detailing this system, the authors then discuss extensions to it, such as "coarse" ratings (such as 5 stars), continuous signals, and different types of users. The authors also address some potential complications to the system, such as having prior relevant information, different tastes, and multidimensional signals. I thought this paper was very thorough in thinking about extensions and practical issues with this system.

 

I questioned a couple of assumptions made in this paper. First of all, it seems that the Nash equilibrium of everyone reporting their signal is highly unstable; if, for example, there were a coalition of people who all falsely reported their signals, the Nash equilibrium would not hold anymore (perhaps it would converge to another Nash equilibrium where everyone reports falsely). For the asynchronous rating section, it seems like the first user has a much smaller incentive to report any signal at all, since the probability of him getting a good return is much smaller than the probability that a user at the end, who already knows the results, gets a good return. For future work, I would be interested in seeing the effects of this system if the incentive system were taken away. Rating systems on sites such as Amazon do not incentivize people to report their signals, but people still rate the products because they want others to use (or not use) the product. I wonder how accurate the results from this type of system are, compared to that. Is this actually better?

 

Paper B

 

This paper discusses an automated system to calculate the necessary minimum payment for users to give honest feedback about a product, and modifications to decrease this minimum payment. The authors model the system as a linear programming problem subject to the constraints that the user must get a higher expected return for giving honest feedback than for lying (and a higher return than for no feedback at all). They discuss the complexity of the problem and conclude that the process may have to be approximated. In addition, the paper then discusses ways to lower the minimum payment, including using multiple reference raters (rather than one) and weeding out bad feedback by looking for outliers. The authors propose a novel idea for weeding out bad feedback - namely, it depends on how useful lying is. If the incentive for lying is not high, then the algorithm will be more lax about accepting outliers.

 

I thought this paper was very clear and presented an interesting way to solve this problem. The assumptions made are fairly simplistic and assume uniformity among all the users. I think it would be interesting to examine the "goodness" of these results, as compared to systems where the users are not offered any monetary incentive for rating the product. Finally, I think it would be interesting to look at the different types of feedback a user can give - often, I find that reading people's comments rather than their ratings is a lot more useful. Is it possible to design a system to look at these comments?

 

Brett Harrison

 

Paper A

 

The first paper gives a theoretical mechanism for eliciting honest feedback in a reputation system, e.g., one where a user can rate a product online. With this mechanism, reporting the user's actual rating based on his opinion (signal) of the product is a Nash equilibrium, i.e., an optimal strategy given that everyone else reports truthfully. This is indeed a neat result, but I think it's a little far from reality. I would like to see someone establish a clear direction for how to implement a mechanism such as this in a real-life application, and test whether or not humans really react with the equilibrium strategy of reporting the truth. But more importantly, this mechanism doesn't seem to be "affordable" in real life, especially if the rewards given for honest feedback are monetary. The rewards could become far too high for any sustainable reputation system.

 

Paper B

 

This paper has a similar flavor to the first paper, but instead provides a mechanism that guarantees some upper bound on the amount gained from deviating from truthful reporting, determined by the amount of reward offered. One of the ideas I really like in this paper is that of bringing in reference raters. I think this could be a very realistic and easy-to-implement addition to any current online reputation system, and it would add to the quality of that system, since some measure of accountability could be introduced. For example, you might publicly announce that if a reference rater catches someone with a false report, they will receive severe negative consequences, such as being banned from the website or losing lots of points, etc. This way, people would be averse to lying.

 

Avner May

 

These papers dealt with the question of how to elicit honest evaluation of a product's quality from users who have already "experienced" the product. The first laid out how to use proper scoring rules in order to do this and have honest reporting be a Nash equilibrium, whereas the second solved how to do this as cheaply as possible (by solving a linear program). The "payments" imagined in these articles could be monetary payments, discounts, or points/status of some sort.

 

These are interesting papers to have alongside eBay's "reputation" system, which seems to rank a seller's reliability more than it ranks product quality. However, I'm not sure how useful in practice a system like this actually is. The scoring system seems like it would be hard for an average user to understand quickly; imagining a user either consciously or subconsciously realizing that honest reporting is Nash seems a bit optimistic, although this analysis does provide some security against attacks on the system. I think that in practice it would probably be easier, in a system like eBay, to merely ask for feedback on product quality (maybe 1-5, maybe -1 to 1) in addition to overall experience with the seller. In eBay we saw that many people, out of courtesy, participate in the feedback system, and thus I see no reason to believe that they wouldn't answer one or two more questions regarding their experience.

 

Alice Gao

 

The main contribution of the paper "Eliciting Informative Feedback: The Peer-Prediction Method" is to propose a method for eliciting honest feedback about the quality of some product or service. The proposed mechanism assigns a score to one rater based upon a comparison between the likelihoods assigned to the reference rater's possible ratings and the reference rater's actual rating. The key insight in getting the result is to compare the implied posteriors to the report of a reference rater. I think this is an important and novel use of proper scoring rules for reputation systems.

 

The second paper, "Minimum Payments that Reward Honest Reputation Feedback," builds on the first paper and formulates the payment problem as a linear optimization problem. As a result, the mechanism of the first paper can be regarded as one special solution of the optimization problem, and the second paper also finds other, more general solutions based on budget constraints and other factors.

 

Comparing the two papers, I found the first one to be quite limited in the sense that it only focused on using scoring rules. The formulation of the problem in the second paper is very insightful in presenting the general form of the problem. Also, the first paper mentions many issues with using the concept in practical applications, but it doesn't go into enough detail on any of them. I feel it would be more interesting for it to focus on one or two and discuss them more thoroughly. Also, it would be useful to find special solutions other than the scoring rule solution, so it would be worthwhile to continue examining the optimization problem in search of other useful special solutions.

 

Travis May

 

In "Eliciting Informative Feedback: The Peer-Prediction Method" and "Minimum Payments that Reward Honest Reputation Feedback," the papers propose interesting solutions to the problem of incentivizing users to provide honest feedback. Anecdotally, this certainly is a problem; I have recommended people on LinkedIn because there is a social incentive to do so, and I rarely respond to online reviews unless I had a polar (great or horrible) experience with a particular product/seller.

 

The papers' solution merges this portion of the course with the scoring papers that we read previously. The methods proposed attempt to determine a "true" quality (by pooling collective information about a product) and establish a scoring rule in which the maximum compensation is provided when the user answers honestly. This is a reasonable approach, but it suffers under risk aversion, in which a user is not seeking to maximize his expected payoff. While the papers address potential solutions to this problem, these are not satisfactory in the context of bounded rationality. In this (quite probable) scenario, a user does not understand or pay attention to a complex payoff formula. Instead, what a user sees is how large his own past payoff has been in previous situations. Most likely, if he gave a score that deviated from the prior expectation, he ended up not being compensated at all, while if he gave the expected score, he received at least some payoff. Thus, from his limited experience, he believes that he is punished for being honest whenever honesty deviates from likelihood.

 

My proposed solution to this problem would be to take the calculation out of the user's hands altogether. Instead of a reward system that encourages honesty, honesty "scores" could be assigned to raters. In most cases, a user assigning a score is a repeat user of a particular marketplace. My proposal is to weight scores by the revealed honesty of the user. By applying the same methodology that these papers propose to the users' scores, we can glean the expected honesty across various portions of the marketplace, and increase a user's credibility as they provide more and more accurate ratings. The solution thus does not strongly change the incentives for users, but it increases the accuracy of reviews seen in a forum where most (but not all) users are honest.
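
A rough sketch of how such credibility weighting might look (entirely hypothetical; the update rule and constants are my own, not from the papers):

from collections import defaultdict

credibility = defaultdict(lambda: 1.0)  # starting weight per user -- assumed

def weighted_average(ratings):
    """ratings: list of (user, value) pairs on an assumed 1-5 scale.
    Returns the credibility-weighted mean, then nudges each user's
    credibility toward their agreement with that mean."""
    total_w = sum(credibility[u] for u, _ in ratings)
    mean = sum(credibility[u] * v for u, v in ratings) / total_w
    for u, v in ratings:
        agreement = 1.0 - abs(v - mean) / 4.0
        credibility[u] = 0.9 * credibility[u] + 0.1 * max(agreement, 0.0)
    return mean

print(weighted_average([("a", 5), ("b", 4), ("c", 1)]))  # c's weight decays

Over repeated rounds, persistently deviant raters would lose influence without any explicit payments, which matches the intent of the proposal.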

 

Zhenming Liu

 

Paper A

 

This paper studies the design of appropriate scoring rules that encourage participants to report their beliefs truthfully. Propositions 1 and 2 look like the most important results of the paper. It is interesting to see that these results also connect to the scoring rules studied earlier this semester. The "extensions" section nevertheless contains a few interesting results; for example, the scoring rule naturally extends to the sequential interaction case.

 

It is (again) slightly disturbing to deal with the section "issues in practical application". Aside from the risk aversion subsection, it seems that most topics discussed in this section try to answer imaginary questions. It is not clear to me which of these imaginary questions would be important, given that implementing the mechanisms discussed in this paper is not practical.

 

Paper B

 

It is not surprising to see follow-up work trying to minimize the cost of implementing the mechanism designed in the first paper. The linear programming model makes the problem tractable, although some of the discussion of the computational complexity of LP in section 3.3 sounds quite strange. For example, it is already well known that worst-case complexity is usually not a good metric for an LP algorithm (e.g., simplex is worst-case exponential, yet usually faster than other, polynomial algorithms).

 

On the other hand, many computational tasks in this mechanism already do not sound tractable. For example, it seems difficult for an agent to estimate a second agent's signal after seeing its own. Further justification is needed to convince us (me?) why it is important to know that the cost optimization problem is tractable.