Student Comments – November 5, 2008

 

Haoqi Zhang

 

Ranking Systems: the PageRank axioms

The main contribution of this paper is the complete axiomatic characterization of PageRank: the authors identify a set of axioms that PageRank satisfies and show that, under these axioms, PageRank is the only possible ranking algorithm. Their result is significant because it shows that any algorithm satisfying the axioms is PageRank, and furthermore, the axioms require only ordinal relations and are fairly reasonable, which provides a justification both for PageRank via the axioms and, conversely, for the underlying axioms via PageRank.

 

In reading the paper, I did nevertheless find the justifications to be roundabout -- the authors argue that the axioms are reasonable because of PageRank, and that PageRank is reasonable because Google uses it. I think one interesting direction for future work is to look at the conditions under which PageRank performs poorly (or well) and to see which of the axioms correspond or contribute to the observed behavior. This may lead us to better understand how these properties relate to what we consider to be 'good' search rankings.

 

----

 

Popularity, Novelty and Attention

The main contribution of this paper is in introducing a simple, expressive growth model of user click behavior on a webpage as it relates to the position, popularity, and novelty of the articles on the page. The authors then fit the model to data from digg.com to estimate its parameters, and show that despite the multiplicative form of the function combining position and novelty, the two factors can be separated. With the goal of maximizing the total number of clicks discounted over time (what the authors call attention), the authors show that there is a sharp transition, governed by the novelty decay rate, between positioning stories by novelty and positioning them by popularity.
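
As a rough illustration of the kind of multiplicative growth model described above, here is a minimal sketch under my own assumptions (an exponential form for novelty decay, made-up parameter names and noise range), not the authors' fitted model:

    import math
    import random

    def simulate_story_clicks(steps, position_factor, decay_rate, n0=1.0, seed=0):
        # Toy model: clicks grow each step by a position-dependent factor
        # times a novelty factor that decays exponentially with the story's age.
        # The names, decay form, and noise range are illustrative assumptions.
        rng = random.Random(seed)
        clicks = n0
        for t in range(1, steps + 1):
            novelty = math.exp(-decay_rate * t)
            noise = rng.uniform(0.8, 1.2)
            clicks *= 1.0 + position_factor * novelty * noise
        return clicks

    # A story shown in a prominent position vs. a low one, same decay rate.
    print(simulate_story_clicks(50, position_factor=0.30, decay_rate=0.1))
    print(simulate_story_clicks(50, position_factor=0.05, decay_rate=0.1))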

 

In a sense we can view this paper as offering alternative ways of ranking results based on the outcome we wish to achieve and the environmental conditions (e.g., the decay rate) present in our setting. I found the results to be quite interesting because they relate the strategies directly to the outcome we seek --- I think models of this type can really influence goal-oriented design in a variety of internet settings.

 

I also wonder whether we can use the results of this paper not only to choose which strategy to use, but also to think about how to influence the various parameters of the system, and how these changes (even small ones) can lead to significant differences in the total number of clicks (and we may choose a different strategy based on the new parameter values).

 

Subhash Arja

 

In the paper "Ranking Systems: The PageRank Axioms" by A. Altman, et al., the authors merely show and illustrate graph theoretic axioms that the PageRank system satisfies. The major conclusion in the paper is that any page ranking algorithm that subsequently satisfies these axioms must coincide with PageRank. This paper was completely theory based with very little descriptions to applications. I found it interesting that they associated PageRank with social choice, since I had only considered it purely from a graph theoretic point of view. The proofs were done well, but I thought that the paper in general could have used more examples. I did appreciate the examples they provided for for the three major axioms in Figure 1. There is no "Conclusion" section in the paper since the authors set out to prove certain axioms. I felt that the discussion section was lacking in the area of future work. The authors mention that this paper is the initial study in the topic of page ranking systems. However, I would have liked to see some hint or inclination towards some concrete applications.

 

Unlike the first paper, the second paper, by F. Wu et al., is very applicable to real-world scenarios and builds its whole argument around a study of digg.com. I always find papers that conduct these sorts of tests interesting because of the implications the results have for the many companies that seek to maximize their profits. These results would be very useful not only for information aggregation sites like Digg but also for news sites and for advertisers. For example, if CNN knew the best way to order its articles to gain more viewership, it could figure out a profit-maximizing assignment to advertisers. If it found that novelty was the major reason people read an article, it would charge advertisers more money to have ads show up instantly alongside breaking news stories. This is one area that can be explored as future work based on the concepts and methodology laid out in this paper. However, the paper limits itself to considering only three strategies, and it is possible that the results are skewed by the frequent users of Digg. For instance, Digg users tend to be more tech-savvy and liberal-leaning, based on the stories that frequently appear on the top-ten list. An interesting study could be done on a more impartial information aggregation site.

 

Rory Kulz

 

Ranking Systems: The PageRank Axioms

 

I came into computer science through pure mathematics, so this is the sort of paper that really appeals to me. PageRank by itself is a great idea and seems out-of-the-blue, but to show that there's a natural reason to have considered a great idea, now that's a *great* idea.

 

Anyway, very cool, I'm anxious to hear about follow-up work. I wondered about possibilities of weakening the axioms. In the last proof, the deletion properties and the edge duplication property seemed maybe more fundamental than the more natural committee, collapsing, and proxy properties. What if we take those as axioms? Like in Euclidean geometry, Playfair's axiom versus the parallel postulate.

 

This also raises the question of whether PageRank varieties that involve damping factors or incorporate time data to boost the PageRank of newer pages (so that they are not overwhelmingly disadvantaged by the number of links) can also be thought of as the canonical ranking algorithm under some set of axioms.

 

Popularity, Novelty and Attention

 

Pretty boring. I had trouble seeing how this fits in with the rest of the voting stuff, exactly, but I guess it's a good idea: attempt to model diggs based on novelty and popularity and figure out how to display stories in order to maximize essentially people getting to see what they will want to see.

 

I don't really buy a lot of this, though. If stock prices (which I would think are less random or chaotic than say memes and Internet culture) are hard to model with stochastic equations and the like, why should diggs fit? Even assuming that and considering the optimal ordering problem, they look at such specific cases that I really fail to see how this fits into a broader theoretical framework. It's also not clear that attention is the thing you want to maximize; maybe if popular stories dominate the front page of digg for long periods of time, people won't visit digg as frequently because they are seeking novel content.

 

Zhenming Liu

 

Ranking Systems: The PageRank Axioms.

This paper links the PageRank algorithm with social choice theory by showing that, assuming a few axioms of a ranking algorithm, PageRank is the unique algorithm satisfying them. It is indeed an exciting result, despite the fact that PageRank has been studied for a long while.

 

There is an array of questions that remain unanswered in this paper, though. My first immediate question is which axiom Kleinberg's page ranking algorithm violates. Kleinberg's algorithm is another important algorithm that ranks pages based on the link structure, and this paper's result suggests that it must violate some axiom. It would be interesting to further investigate Kleinberg's algorithm in the axiom system proposed in this paper.

 

Second, the "naturalness" of the proposed axiom system needs more justification. It may be self-evident that the axioms should hold for any ranking algorithm; nevertheless, they do not seem to be critical to one. It is not clear whether there exist other ranking algorithms that sacrifice one or two of the axioms proposed in the paper.

 

Also, a social network is arguably a dynamic system, while PageRank was designed to rank static web pages. It is not clear how PageRank works (and how the axioms would be re-formulated) if the connectivity in a social network changes frequently. Along the same line of thought, there can be many types of connections between two entities in a social network (e.g., positive, negative, strong tie, weak tie), whereas the PageRank algorithm treats the connections among pages uniformly. It is not clear how we can close the gap between the real world and the PageRank model.

 

Popularity, Novelty, and Attention

This paper studies the roles that popularity and novelty play in attracting users' attention. The authors extend an existing novelty model by adding a "popularity" parameter that depends on the position of the story/link. They then run a regression on real data to obtain estimated parameters, build a simulation from those estimates, and test different algorithms within the simulation.

 

This paper does not put much effort into justifying the model, and I cannot see the reason for establishing the relation $N_{t + 1} = N_t(1 + a_i r_t X_t)$ in their model. It also looks strange to me that they can only test their algorithms in a simulated environment. They claim that it is usually hard to estimate the algorithms' performance without simulation; on the other hand, I believe finding an (approximately) optimal algorithm is possible, rather than merely comparing a few simple strategies.

 

Xiaolu Yu

 

Ranking Systems: The PageRank Axioms

Popularity, Novelty and Attention

The importance of the first paper is that it initiates work on this topic by introducing a representation theorem for PageRank, bridging the gap between page ranking algorithms and the mathematical theory of social choice. Page ranking is fundamental to search technology. Motivated by the major open challenge of tackling the descriptive approach in the new internet context, where the set of voters and the set of alternatives coincide, the paper introduces a representation theorem for PageRank: any page ranking algorithm that satisfies isomorphism, self edge, vote by committee, collapsing, and proxy is proved to be the PageRank ranking system. Furthermore, any two ranking systems that have the weak deletion, strong deletion, and edge duplication properties and satisfy the self edge and isomorphism axioms are the same ranking system.

The main contribution of the second paper is to suggest a principled way of choosing what to prioritize in order to maximize attention when designing dynamic websites. Depending on the rate of decay of novelty, one can choose between prioritizing novelty and emphasizing popularity, but the decay rate must be measured with great care, because the switch between the two benchmark strategies happens sharply around a critical value, resembling a phase transition observed in the real world.

One question I have after reading this paper is about the third strategy. I do not see (or maybe I overlooked something) where the article clearly proves that this strategy is indeed myopic and generates the most clicks only in the next few minutes. According to Figure 4 and the corresponding discussion, I understand that O3 is in general the best strategy and that assigning different weights to novelty and popularity yields an even better strategy, but the paper does not provide a convincing proof of the myopic character of O3. If a great amount of effort is needed to measure the decay rate of novelty, why bother to do it, since we also have to be prepared for possibly reversed results if we make even a minor mistake?

A second question: we are now aware that the second strategy, prioritizing popularity, performs poorly in the digg.com case. It is also discussed that if novelty decays too slowly, strategy two should be preferred to strategy one. I keep wondering what the possible real-world applications of strategy two are, because, as the article explains in section three, new stories can never find their way to the front page since all the old stories have more clicks.

 

Victor Chan

Papers: Ranking Systems: The PageRank Axioms and Popularity, Novelty and Attention

 

The two papers Ranking Systems: The PageRank Axioms and Popularity, Novelty and Attention examine the internet and the way webpages are ranked and preferred. The PageRank paper shows how to rank pages on the internet based on the links they have to each other; the PageRank algorithm is described as a social choice or preference ordering of pages based on the likelihood of reaching a page through links. The Popularity, Novelty, and Attention paper by Wu and Huberman deals with how to generate the most clicks on a single webpage based on the popularity or novelty of the links, where popularity is described as the number of clicks a link has already received and novelty as how new/fresh the link is.

The main contribution of the first paper is to describe a set of axioms for a ranking system that includes isomorphism, self edge, vote by committee, collapsing, and proxy. The paper then shows that PageRank satisfies all these axioms and that any other page ranking algorithm satisfying them will coincide with PageRank. The paper treats the internet as a graph where the links are the edges between the vertices (webpages). It would be interesting to apply different weights to these links/edges, since a back link from Wikipedia might be more valuable than a back link from a personal website. If this were done, I would say some of the axioms could not hold. For example, collapsing and proxy rely on notions of removing or adding pages in between links without affecting the rankings of the pages before or after.

The second paper talks about how to maximize the number of clicks on links on a dynamic webpage. The paper examines three strategies for placing links: one based on popularity (number of existing clicks), one based on novelty (how new the link is), and one based on the likelihood of generating additional clicks in the next time period; a sketch of these orderings follows below. The results were obtained from simulations and showed that, based on the decay factor of novelty, one should pick either the popularity scheme or the novelty scheme. It is interesting to note that the switch between these two strategies occurs abruptly at a particular value of the novelty decay factor. The paper also suggests that the third strategy will work well in most cases, since it mixes popularity and novelty and doesn't bet on only one factor. I found that since this analysis was performed on a website such as digg.com there are certain limitations. For example, on Digg, users try to find news bits and read them, and relevancy is not key. A search engine such as Google would benefit more from listing search results based on popularity rather than novelty, no matter what the decay factor was.
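
To make the three strategies concrete, here is a minimal sketch of the orderings as I read them; it is my own simplified re-creation with an assumed exponential novelty decay and made-up story data, not the authors' simulation:

    import math

    # Each story is (name, clicks_so_far, age_in_steps); values are made up.
    stories = [("a", 500, 40), ("b", 120, 5), ("c", 60, 1), ("d", 300, 20)]
    DECAY = 0.1  # assumed novelty decay rate

    def by_popularity(items):
        # most-clicked stories first
        return sorted(items, key=lambda s: s[1], reverse=True)

    def by_novelty(items):
        # newest stories first
        return sorted(items, key=lambda s: s[2])

    def by_expected_clicks(items):
        # one-step greedy: current clicks weighted by remaining novelty
        return sorted(items, key=lambda s: s[1] * math.exp(-DECAY * s[2]),
                      reverse=True)

    for order in (by_popularity, by_novelty, by_expected_clicks):
        print(order.__name__, [s[0] for s in order(stories)])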

Both papers presented interesting insights into how content is indexed on the internet. A possible project idea would be to see how both these indexing schemes can be manipulated. There are a lot of people doing SEO on Google, and it would be interesting to see if PageRank can be improved to be unaffected by some of the techniques that exist. (I'm sure Google is already working on this though....)

 

Angela Ying

 

First paper: Popularity, Novelty, and Attention

 

I thought this was a very interesting paper. It discussed methods of ranking pages, focusing on Digg, a website where users can post links and other users can express that they like a link by "digg"ing it. Specifically, it discussed the different effects of popularity, which is either the number of clicks or the number of diggs, and novelty, which is expressed as a decaying function that starts at 1. The paper also showed from Digg data that being in the first position versus the second position makes a difference in diggs. The authors developed three models: one that ranked in favor of more popular links, one that ranked in favor of more novel links, and one that ranked in favor of the most "replicated" story, which was a balance between popularity and novelty. As expected, the popularity-only model failed because stories lost their novelty while new stories could never show up in the top 15. However, a really interesting result was the phase diagram, indicating that the decay rate was a big factor in how effective the remaining two models were.

 

I thought that a weakness of this paper was some of the assumptions made. Particularly in the simulations, it seems that keeping only 15 pages is not realistic at all and can hurt the popularity model. There will be people who browse pages other than the home page on Digg, which means they may find newer stories that are very entertaining and increase their popularity. To make the popularity model even more effective, perhaps a new feature that we could explore is the possibility of undigging something, i.e. if a news article has gotten too old, users can vote down its popularity to remove it from the front page.

 

Second paper: Ranking Systems: The PageRank Axioms

This paper discussed the fundamentals of ranking systems, most notably PageRank. The main contribution of the paper was linking internet ranking systems with graph theory, representing the internet as a strongly connected graph of webpages, which are vertices, with edges representing links between webpages. The paper listed the properties of isomorphism, self edge, vote by committee, and collapsing as axioms essential for such a ranking system. Furthermore, it proved that a ranking system satisfying these axioms is preserved even after deletion of vertices. Finally, it showed that PageRank satisfies all these axioms and that any page-ranking system satisfying them must coincide with PageRank. I thought this was an interesting paper, although overall I was slightly disappointed because I wasn't sure what the main point of the paper was. It seems like it set up a basis for general page rank systems, but it did not prove many theorems that would reveal interesting properties of PageRank.

 

For future work, I think it would be interesting to examine some complexity results for PageRank - how long does it take to come up with a good ranking, given a graph? Furthermore, I would like to see how well PageRank actually works and how often it updates its graph of webpages, given that pages are constantly being created and taken down.

 

Alice Gao

 

Wu and Huberman presented their analysis of the effects of different strategies on maximizing attention to certain types of dynamic webpages. Their main result states that webpage managers should order stories either by novelty or by popularity, depending on several different factors. The results seemed pretty intuitive to me. However, I feel that the discussion and approach used in this paper do not fit exactly into the framework/terminology we have been using in social choice theory. I suppose I could view designing a website as a mechanism which aims at showing the users their most preferred stories. One question I have about this paper is that it implicitly assumes users' preferences are based on criteria such as novelty and popularity. In reality, users browse the internet for many different purposes, and it is quite possible for them to have different kinds of goals/preferences. So I think this paper only captured a particular type of user preference for this scenario.

 

Altman and Tennenholtz formulated axioms which completely and uniquely characterize the PageRank algorithm. A possible project idea would be to consider several of the axioms and perhaps try to relax them or make them more restrictive. After all, I think the ultimate goal of building theories related to any practical application is to build better applications using the theory formulated. So it would be useful to consider how these theories could help us design better algorithms for ranking internet webpages.

 

Nikhil Srivastava

 

The Altman and Tennenholtz paper provides an interesting characterization of the PageRank algorithm by a minimal and fairly intuitive set of axioms it satisfies. The result is certainly interesting from the "descriptive perspective" put forth in the paper's introduction of a representation theorem, but I found the lemmas and proof to be fairly algorithmic and to provide little insight into the workings of this particular voting rule. (I think the linear algebra perspective is the most intuitive.) What might have been more useful is a discussion of PageRank in terms of the previous issues we've been discussing, such as false-name-proofness (issues with link farms, etc.), manipulability, and responsiveness.
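
For readers who share that linear-algebra view, here is a minimal power-iteration sketch of the undamped, graph-theoretic version discussed in the paper; the tiny example graph, iteration count, and tolerance are my own choices, not anything from the paper:

    def pagerank(graph, iters=500, tol=1e-12):
        # Stationary distribution of the random walk that follows a uniformly
        # chosen outgoing link (no damping), computed by power iteration.
        # Assumes the graph is strongly connected, as in the paper's setting.
        nodes = list(graph)
        rank = {v: 1.0 / len(nodes) for v in nodes}
        for _ in range(iters):
            new = {v: 0.0 for v in nodes}
            for v, outs in graph.items():
                share = rank[v] / len(outs)
                for w in outs:
                    new[w] += share
            if max(abs(new[v] - rank[v]) for v in nodes) < tol:
                return new
            rank = new
        return rank

    # Tiny strongly connected example: adjacency lists of outgoing links.
    g = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    print(pagerank(g))  # roughly {'a': 0.4, 'b': 0.2, 'c': 0.4}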

 

The Wu and Huberman paper was a useful presentation of an interesting application of aggregating social information (maximizing attention to websites), and it contained some useful empirical results. As a frequent browser of social news websites I found this paper quite engaging, though it has a very specific focus on the ordering of links on a website. I think useful results might be obtained for more general classes of problems that require aggregating users' choices to structure any arena in which users are presented with arrays of information: restaurant choices, real estate listings, etc.

 

It might also be useful to combine these results with situations in which a website designer *knows something* about the individual users. Facebook comes to mind: how might that company re-organize its "News Feed" based on metrics of popularity, novelty, and user-specific information to maximize attention? Personally, I'm much more likely to click on a picture of a friend than some news item about his or her organizations or relationship status.

 

Sagar Mehta

 

Popularity, Novelty, and Attention

 

This paper could have ramifications for how websites prioritize their information content. The authors analyzed the role of popularity and novelty (and positioning) in attracting viewers to content on a dynamic website. The authors' ideas are formalized into a mathematical model, and they conclude that whether one should emphasize novelty or popularity depends on the rate of "decay" of novelty. While the goal of this paper was to think about how to optimize click rate, I wonder how the voting scheme used by Digg ties into the papers we read on anonymity-proofness and voter manipulation. Do the articles on the front page of digg.com truly reflect their popularity? What if I have 100 accounts and digg my new articles as soon as I post them? Furthermore, if newer articles are given preference, can we characterize how many people act on the incentive to re-post the same article multiple times?

 

 

Ranking Systems: The PageRank Axioms

 

The main result of this paper is that any page ranking algorithm which satisfies a certain set of axioms must coincide with PageRank. PageRank is an interesting application of information aggregation, except here the "information" or votes come from each website (through links to other websites) rather than from actual voters. It is fundamentally a gauge of what is important to users of the internet. The paper takes the descriptive perspective towards obtaining its axioms, but I'd be interested in the reverse direction. Obviously, our set of requirements for a social aggregation rule will differ based on what our end goal is. But if our goal is gaining insight into the relative popularity or importance of certain websites, what rules should the aggregation satisfy? Are there rules which satisfy these requirements?

 

Avner May

 

Ranking Systems: the PageRank Axioms, by Alon Altman and Moshe Tennenholtz

Popularity, Novelty, and Attention, by Fang Wu and Bernardo A. Huberman

I found these papers very interesting and enjoyable to read. 

 

I thought the first paper did a wonderful job breaking the PageRank algorithm down into its core axioms, which the authors successfully showed fully determine the ranking system. I believe this is very valuable in the sense that it truly allows one to analyze the merits of the algorithm based on the properties it satisfies. It also allows for the possibility of designing new ranking algorithms which satisfy some, but not all, of these properties, while hopefully satisfying other desirable ones. This seems like the most reasonable way to go about designing new ranking systems, or of arguing for the merits of the PageRank system itself. On another note, I thought that framing the question of page rankings as one of social choice, where each link is a signal of preference, was a very interesting approach.

The second paper investigated a very specific question: the optimal ordering of news stories on Digg.com. As suggested in the article, this question can definitely be generalized to other cases where a site consolidates content from other sites and is interested in how many clicks out of the site are achieved. The question is intellectually stimulating in my opinion: how do you order links to maximize clicks? This paper only analyzed the position of the link as the variable to optimize over, not at all investigating questions of text size, images, video content, or other attention-grabbing strategies. Even though the question was very specific, I think this paper did an unbelievable job of analyzing it in its fullest detail, and it found some striking results. I would not have foreseen the extremely sharp phase transition demonstrated in this article. I think it was great how they used a functional form for clicks based on actual click data from digg.com, and thus in my opinion provided a useful result. I doubt Digg.com fully understood why (or whether) their link ordering was optimal, and this paper definitely explains this in full.

 

Ziyad Aljarboua

 

Ranking Systems: The PageRank Axioms:

 

This paper discusses the PageRank algorithm used to rank pages and the rationale for using a particular page ranking system. To do so, the PageRank axioms are introduced; any page ranking algorithm that satisfies those axioms coincides with PageRank.

 

In the PageRank algorithm, the rank of web pages is based on other agents' input (e.g., the number of webpages that link to the page). This is a way to make use of the aggregate individual ranking of webpages, so in a sense the task of ranking pages is a social aggregation. In the PageRank algorithm, pages are assigned probabilities (the probability that you will get to that page when starting from a random page) based on the links.
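
As a rough illustration of that random-surfer reading (my own toy example, not taken from the paper), one could estimate those probabilities by simulating a long walk that repeatedly follows a uniformly chosen outgoing link and counting visits:

    import random
    from collections import Counter

    def random_surfer_estimate(graph, steps=200000, seed=0):
        # Fraction of time a walk that always follows a uniformly random
        # outgoing link spends on each page; assumes a strongly connected graph.
        rng = random.Random(seed)
        page = rng.choice(list(graph))
        visits = Counter()
        for _ in range(steps):
            visits[page] += 1
            page = rng.choice(graph[page])
        return {p: visits[p] / steps for p in graph}

    # Tiny made-up link graph (outgoing links only).
    g = {"x": ["y"], "y": ["x", "z"], "z": ["x"]}
    print(random_surfer_estimate(g))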

 

The main contribution of this paper is the set of axioms that are used to evaluate page ranking algorithms. The axioms represent rules that ranking algorithms must follow to coincide with the PageRank algorithm. The first three axioms are intuitive. The first axiom requires that the page rank be independent of the names of the vertices of the graph representing the agents. The second one makes a distinction between a link originating from a webpage to itself and other links. The third axiom requires that the page rank be independent of the way the page links to other pages. The fourth axiom requires that in the case of collapsing, the page that was not dropped should have a rank equivalent to its original rank plus the rank of the dropped page. The final proxy axiom requires that dropping a page that acts as a midpoint between two sets of pages should not affect the rank of those pages.

 

 

This paper does not discuss some concerns that can potentially affect the ranking system. It does not take into account manipulation, both at the group and individual level, aimed at positively changing the rank of a webpage. The second axiom touches on this issue by distinguishing between links originating from a page to itself and other links, but what about links from other pages that do not really represent social aggregation?

------------------------------------------------------------

 

Popularity, Novelty and Attention:

 

This paper discusses how the popularity and novelty of a website can attract users. Three strategies are considered here to maximize attention: in the first two, popularity or novelty is given the highest priority, while in the third, links are prioritized based on expected future interest. A conclusion is reached by comparing the performance of all three strategies. This paper shows that the first two strategies are more effective than the third one in most cases; the selection of the first or second strategy depends on the novelty of the webpage and the rate of its decay.

 

In linking the popularity and novelty of a page (or a story, in the case of digg.com), a new factor is suggested to relate the number of clicks each link gets to the position of the link on the page: links at the top of the page are expected to have more traffic than those at the bottom.

 

In general, this paper does a good job of examining these three strategies. However, it looks at a specific model, the one implemented by digg.com. How can the findings of this paper be generalized to other models? Also, what about other strategies that I think could be more important than popularity and novelty? For example, one strategy that could attract attention to a link is relevance: a user from Asia might not be interested in reading about the US elections, whereas someone from the US is more likely to read the story. So a strategy to attract more users is to increase relevance by personalizing stories based on location or past interest.

 

Also, this paper implicitly assumes that the popularity of a page is independent of its novelty. I think there is a direct correlation between novelty and popularity: pages are popular because they are novel.

 

Finally, how would changing variables in the model used to reach the conclusion alter the results? For example, how would changing the step time from 5 minutes to 10 minutes, or changing the average new story arrival time, change the findings? It was observed that changing some of these parameters would affect the performance of the strategies. How would those findings be implemented in a highly dynamic website where the average new story arrival time changes frequently?

 

Hao-Yuh Su

 

1. Ranking Systems: The PageRank Axioms

2. Popularity, Novelty and Attention

 

The 1st paper connects the page ranking algorithm PageRank and the mathematical theory of social choice through graph-theoretic ordinal axioms. It first derives the axioms of PageRank, and then proves that PageRank is the only algorithm that satisfies all of those axioms. About this paper, I have some questions. The first two are about the PageRank algorithm itself. In this paper, it is said that the process starts at a random page. My question is how this random page is generated, and won't it lead to different results if we start from different random pages? The second question is about the equation: . It seems to be a key equation in the PageRank algorithm. What is the intuition behind deriving it? My third question is: what is the difference between and ? With my last question, I want to make a connection between the previous paper and this one. The previous paper talked about casting multiple votes in a vote. When we link page ranking with social choice, should we consider this aspect as well? In the real world, a single user may repeatedly click on the same link several times and therefore affect the page ranking. I am not sure of this, but should we try to exclude this case when computing the page rank?

 

The 2nd paper investigates the three strategies used to maximize attention on a website. Firstly, it shows that the position of a story does affect its popularity. In deriving this conclusion, the authors made the assumption that the novelty effect and the position effect can be represented by two independent factors; that is, stories in different positions hold the same novelty factor. I think this assumption is oversimplified. As the authors said, it needs to be tested empirically, but how should we approach that? I agree that the expected growth rate is the product of the two factors, but when the novelty factor is a function of both time and position, how can we tackle it? My second question is about the second strategy, in which stories are sorted according to their popularity. However, won't it overlook the possibility that some story in a later position might have been the most popular one if it had been posted in the first position?

 

Malvika Rao

 

I found the paper "Ranking Systems: The PageRank Axioms" to be

particularly interesting. The paper attempts to establish a foundation or

existential theory behind internet algorithms such as ranking algorithms.

Previously I had considered such algorithms as mere heuristics. And it

appeared that work in this area focused on building more heuristics on top

of each other to get better performance. Rather than adding layers of

complexity for better performance, this paper goes in the other direction.

It tries to simplify.

By tying ranking to social choce (a very apt analogy) the paper tries to

get at the essence of what ranking really is.

 

It would be interesting to extend this type of analysis to other ranking algorithms. What further parallels can we draw with social choice? Is ranking manipulable? Can we characterize the results given by different ranking systems into classes of some sort? If not ranking, then are there other "systems" that we could apply social choice theory to?

 

Brian Young

 

Both these papers arrived at interesting results about page ranking that left me wondering about manipulability.

 

I was impressed with the insight at the root of the Altman and Tennenholtz paper, that page ranking could be seen as a limited form of social choice. Altman and Tennenholtz define some desirable qualities in a ranking system, then demonstrate that PageRank is the unique ranking system that satisfies all these properties.

 

This makes me wonder, again, how resilient PageRank (or at least the graph-theoretic version of the algorithm) is to manipulation. Much as we have seen with other rules that seem to achieve a desirable set of properties, PageRank seems susceptible to some form of strategic behavior, in this case designed to drive up the rank of a particular page. Obvious examples include adding cycles among pages, whether legitimately (through some form of link-trading with the owner of another page) or dishonestly (by creating new pages whose only purpose is to link and receive links). Which of these properties would we have to relax in order to prevent or address such manipulation?

 

(One idea we have seen before is to introduce some kind of cost or penalty, such as by decreasing the credibility we assign to pages as they spread farther from some source. But this would seem to make all the properties found by the authors no longer tenable.)

 

Wu and Huberman consider the respective roles of novelty and popularity in ranking links on news-aggregation websites, specifically digg.com. The paper models three different strategies, one placing the newest articles on top, one placing the most popular articles on top, and one using a greedy algorithm to make a selection. I thought it was interesting that the authors did not appear to try other combinations of novelty and popularity besides simple multiplication.

 

From a usability standpoint, there is a clear benefit in there being a simple and predictable ordering, such as novelty or popularity. Given the paper's results (confirming straightforward intuition), what is popular tends to be static over a longer period of time than does what is new (i.e. the things that are new change faster than the things that are popular), so perhaps it's a mistake to compare them in this way; I'd think that the common idiom (which Digg presumably uses, although I've never browsed the site) of listing all the incoming articles sorted by newness while keeping a sidebar of most popular articles would make more sense than what the paper proposes.

 

I think some kind of user study might be appropriate as well to determine what users' relative priorities are in visiting such sites.

 

Travis May

 

These papers present methods to rank page results on a search engine in light of social choice theory. While "Popularity, Novelty, and Attention" focuses on click-related metrics, "Ranking Systems: The PageRank Axioms" focuses on Google's PageRank methodology, which recursively defines a page's popularity based on the quality-weighted popularity of the sites that link to it.

 

Intriguingly, this set of axioms can be applied to a number of other fields, not just the internet. Any self-contained network with one-directional relationships can have this methodology applied to it, and the social choice axioms still apply. For example, in measuring the academic output of professors, a PageRank-style algorithm could be used to create a refined quality index. Popularity could be defined by applying a PageRank scheme to sites like Facebook. Once PageRank is established as a set of social choice axioms, there are a number of other ways in which that methodology can be used.

 

Peter Blair

 

Popularity, Novelty and Attention

 

In this paper the authors examine the roles of popularity and novelty, as well as an additional strategy that focuses on "hot" stories that will result in more clicks in the immediate future. The central result of this paper is that there is a dramatic phase transition in the ranking scheme, where the novelty strategy overtakes the popularity strategy as the preferred method of page ranking. This research could be enhanced by considering the type of website, which should also play a role in which strategy is best suited for a given website, and by considering where along its development cycle the website is. Take, for example, Facebook: after its initial launch at Harvard University, visits to Facebook were initially driven by the novelty of the website. However, as Facebook became more widely used, it became easier to stay in touch with friends via Facebook since most people were on it. Here we see an example of a website that grows based on novelty in its initial period and based on popularity, or wide usage, as it matures.

 

Andrew Berry

 

Ranking Systems: The PageRank Axioms

 

The Altman, Tennenholtz paper discusses PageRank, a page ranking algorithm. The most interesting part of the paper was the overall approach of reducing the page ranking problem on the internet to that of finding the limit probability distribution of a strongly connected graph. I found the explanations of the axioms clear, and, in particular, the axiom dealing with collapsing was well done. The idea of collapsing while keeping the relative rankings of the pages seems very similar to some of the lemmas proven in the voting protocol papers that we read earlier this week. One thing I was left wondering after the paper is whether it is possible to consider axiomatizations other than graph-theoretic ordinal axioms for PageRank. Additionally, how different would the axiomatization of other page ranking algorithms be? Where would the fundamental differences lie for the Hubs and Authorities procedure mentioned?

 

Popularity, Novelty and Attention

 

The Wu, Huberman paper attempts to maximize attention on websites by using strategies that trade off novelty and popularity. I think this paper and the approach it takes are interesting. I think the equations that represent the growth of clicks to a website were well thought out; in particular, I thought equations (2) and (3), which also take into account the page position of a given story, were accurate models. My main problems with this paper regard actually measuring the parameters. The paper never truly defines novelty versus popularity! Additionally, in regard to the third strategy, in which we predict future popular stories, is the one-step greedy strategy appropriate? Is a two- or three-step greedy approach intractable, and if not, would it provide better performance? Also, the results seem highly dependent on the decay factor for novelty. In application, how does one measure a novelty decay rate? The optimal strategy to maximize attention relies heavily on this parameter, and the authors indicate that this rate must be "measured with great care." Yet this concept is so abstract that perhaps this work can only help maximize attention for stories at the extremes (those that lose novelty quickly vs. those that lose novelty slowly).

 

Nicholas Wells

 

Ranking Systems: The PageRank Axioms

 

This paper presents an axiomatization of PageRank, seeking to move it as a concept into the realm of social choice theory.

 

The authors present a series of axioms that characterize PageRank and draw out their proof of completeness. They also show that no other system satisfies the set of axioms presented.

 

The aim of this paper seems to be very theoretical in that it introduces PageRank, a popular internet ranking scheme, into formal theory. The proofs presented are rather technical, and further discussion placing them in context would be useful.

 

Popularity, Novelty and Attention

 

This paper looks at how popularity and novelty attract attention by analyzing three strategies for placing stories on a website (such as digg.com). The authors show that popularity- and novelty-oriented strategies should be chosen depending on the rate of novelty decay, while an expectation strategy does not work well.

 

The authors use a simulation to test their theory. They model new story arrivals as a Poisson process, with a new story arriving on average every 20 minutes. I would like to better understand how they test performance. They find, however, that ordering by decreasing novelty is a good model if novelty decays rapidly, while ordering by decreasing popularity is a good model if novelty does not decay fast enough.
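
A minimal sketch of that arrival process (exponential inter-arrival times with a 20-minute mean; the window length and random seed are my own choices):

    import random

    def story_arrival_times(total_minutes=240, mean_gap=20.0, seed=0):
        # Poisson arrivals: independent exponential gaps with a 20-minute mean.
        rng = random.Random(seed)
        t, arrivals = 0.0, []
        while True:
            t += rng.expovariate(1.0 / mean_gap)
            if t > total_minutes:
                return arrivals
            arrivals.append(round(t, 1))

    print(story_arrival_times())  # arrival times (in minutes) of new stories over 4 hours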

 

This paper only explores these three models; I would be interested in seeing what someone could come up with by exploring additional models, including perhaps a hybrid model of both popularity and novelty ranking.