Student comments November 19, 2008

 

Paper A: Peekaboom

Paper B: CAPTCHA

 

Haoqi Zhang

 

Paper A

 

The main contribution of the paper is in presenting Peekaboom, a game in which one player reveals the parts of an image that relate to a given word while the other player tries to guess the word. The game is useful in that the revealing process shows how the word relates to the image, e.g. where the leg of a cow is in the image. I found the incentive structure to be well designed to promote participation, e.g. by alternating the roles of the players, using a bonus round, and the use of pings.

 

Some comments:

- For words that are general nouns describing an object, it would seem that you want the players to mark the whole cow; instead, players get more points just by marking the face, from which the other player will know that it is a cow, and no bounding box for the whole cow is generated.

- One interesting thing is that cheating is in many senses no longer fun. Purely malicious attempts aside, it seems that no cheater will stay in the system for a long time.

- It seems strange that giving a hint earns positive points --- perhaps hints should be a directly accessible part of the interface (rather than something given only when necessary), kind of like how in charades people always try to signal how many words there are.

- Is there a tradeoff between the usefulness of the output and the fun factor of the game? It seems that getting users to play is the most important factor for success. For example, one can imagine a training tutorial in which players start with simpler images and work their way up, so that they aren't immediately discouraged by their lack of initial success. The loss here is that during training the player isn't contributing much to the system.

- Which games do people play more often? Why?

 

Paper B

 

The main contribution of the paper is formalizing the construction of CAPTCHAs based on hard AI problems, in which humans can easily generate the correct output but machines cannot (and if they could, a hard AI problem would be solved, which is good as well). CAPTCHAs are useful in a variety of domains in which one wishes to verify that a human agent, rather than a bot, is interacting with a system. One of the key assumptions of CAPTCHA is that problem instances with solutions are easy to generate, which enables on-the-spot generation of problems instead of drawing from a fixed dictionary of problems, which would become trivial once the adversary obtained it (which is to say that an adversary is assumed to know the algorithm underlying the CAPTCHA but not the randomness it uses).
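
As a minimal sketch of this generation assumption (my own illustration; none of these names or details come from the paper), the generator below is entirely public, and its security rests only on the secret randomness it consumes:

```python
import random
import string

def generate_captcha(secret_seed: int):
    """Return a (challenge, answer) pair from secret randomness.

    The algorithm itself is assumed to be public; only the seed is
    secret. The 'distortion' here is a trivial stand-in -- a real
    CAPTCHA would render the answer as a warped image instead.
    """
    rng = random.Random(secret_seed)
    answer = "".join(rng.choices(string.ascii_lowercase, k=6))
    # Stand-in distortion: sprinkle noise characters at random positions.
    challenge = list(answer)
    for _ in range(3):
        challenge.insert(rng.randrange(len(challenge) + 1), rng.choice("#*~"))
    return "".join(challenge), answer

# Fresh instances on demand -- no fixed dictionary for an adversary to steal.
print(generate_captcha(secret_seed=42))
```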

 

While CAPTCHAs are quite useful, it is not clear to me just how large the class of problems that are very hard for computers but very easy for humans is. In the image recognition case, wouldn't machine learning algorithms catch up by taking sample instances labeled by humans and generalizing patterns of recognition? Would such success amount to solving a hard AI problem? I guess I didn't quite understand the connection fully here.

 

A point about serialization: one interesting tweak applicable to CAPTCHAs is that there are situations where humans can come up with a close-to-correct solution (e.g. one letter off in recognizing a word) whereas it is difficult for a computer to get within a small error. In these cases, we can tune the ACCEPT conditions of the CAPTCHA so that it takes humans less time but doesn't help computers. One possibility is increasing the number of letters in a CAPTCHA while allowing for some mistakes, which humans may be able to handle very well, whereas a computer now has to deal with more letters.
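
A minimal sketch of such a tolerant ACCEPT condition (my own illustration, not from the paper), using edit distance so that near-miss human answers still pass:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def accept(response: str, answer: str, tolerance: int = 1) -> bool:
    """ACCEPT if the response is within `tolerance` edits of the answer.

    A longer answer with tolerance > 0 stays quick for humans while
    giving an OCR-style attacker more letters it must get right.
    """
    return edit_distance(response.lower(), answer.lower()) <= tolerance

print(accept("velocitu", "velocity"))  # True: one letter off, a human slip
print(accept("vekocjtu", "velocity"))  # False: three edits away
```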

 

 

Andrew Berry

 

Paper A

 

This paper introduces Peekaboom, a web based game that is also an effective means of gaining training data for computer vision algorithms. After reading "Designing Games with a Purpose" which claimed that challenge is a big way to ensure that users will continue to play a GWAP, I question the effectiveness of giving additional points for using hints in Peekaboom. Not only is this counterintuitive, but I do not quite understand how this gives additional information about the relationship between the word and the image. Similar to my critique of other GWAPs with automated players, I am unconvinced that emulating Peek with an automated player will yield good, unbiased training data. I also would have liked to see the usage statistics compared to a similar game. After all, each user played about 18 games of Peekaboom in the span of a month. Is this a large number? Even when the paper stated that every player in the top scores list played 800 games I was skeptical because the size of the top score list was not given.

 

The anti-cheating mechanisms to ensure the purity of data are very well thought out for Peekaboom. I worried that only words in the dictionary would be considered valid responses in the game. However, seeing that the interface notifies the players of misspelled words is reassuring. I do wonder if there are enough images for which slang words or abbreviations could be tags. These could not be captured effectively by Peekaboom. But, this may be an insignificant subset of the images which we desire to test. One thing that was not addressed is how varying the pixel radius could affect the responses of Peekaboom players. This would be an interesting extension of this preliminary work.

 

Paper B

 

This paper presents the theoretical framework behind CAPTCHAs. In this paper, hard AI problems are distinguished as those that humans can solve but current computers cannot. The most interesting concept in this paper was gap amplification. Being able to use any AI problem in which there is a gap between computer and human ability as a security parameter gives seemingly infinite flexibility. Coupled with this flexibility, as long as there are hard AI problems, there will be application security. Unlike in cryptography, no foresight is needed; but if the AI community is correct in believing that all hard AI problems will eventually be solved, does cryptography provide better long-term security techniques, or is a similar belief held in that field? Also, the paper does not impose a time limit within which the human must solve the problem. The paper also discusses the MATCHA family and states that it is a slightly impractical CAPTCHA; the reason why is not apparent to me. A computer program can be guaranteed success in this problem type with probability 1/2, but the paper states that this is normally unacceptable and that this probability could be lowered over repeated games. Would repeated games be too computationally difficult?

 

 

Nikhil Srivastava

 

In the first paper, von Ahn contributes yet another GWAP - Peekaboom - that gathers user-generated information about objects' spatial locations in images to assist computer vision algorithms. It is a second-generation ESP game; ESP only tagged images with metadata that pertained to them in an arbitrary way, while Peekaboom awards points both for providing semantic hints and for guessing a label from a limited visible region. This game sounds incredibly fun to play (I wish it were still online - is it?), and by its nature I think it provides more useful information and seems more resistant to obvious cheating strategies. I think introducing more flexibility and user decisions into gameplay (whether to ping or reveal, whether to give a hint) can increase enjoyability and information usefulness, though I wonder when complexity might frustrate users.

 

In the second paper, the concept of a CAPTCHA is presented as a method of distinguishing humans from computers with hard AI problems and is evaluated formally in terms of its ability to be solved by humans and computers. Two families of AI problems are identified that can be used to construct CAPTCHAs, and their solutions are shown to solve steganographic communication problems. Some of the formal proofs were tricky for me to understand, mostly because I had trouble seeing the motivation behind different notations and the like. Personally, I sometimes have great difficulty reading CAPTCHAs, and I'm pretty sure I'm a human. I'd be interested in knowing what progress has been made in this area since the paper came out.

 

 

Sagar Mehta

 

Paper A

 

I was impressed by Peekaboom and think it provides a much more robust set of data for computer vision algorithms. Interestingly, the initial set of data from the ESP game was the basis for the new game, and I wonder if the same approach can be extended to other GWAPs – such as Verbosity. In playing that game, people often think of the easiest clues rather than filling in the most accurate ones. Sometimes the clues a player gives may fit in one of the categories, but in the interest of speed they may simply plug them into the top slot, ignoring the relation between the word and the clue. So, using the data from Verbosity, it may be possible to create a new game where players are asked to describe the relation between a word and clue data, and are rewarded for matching (as in the ESP game). For instance, if I were describing "rice" in Verbosity, I might type "grain", then "food", then "lice", all in the "it is" field. A more accurate description of rice, however, would be "it is a type of grain", "it is a type of food", and "it rhymes with lice". Creating a second game where users take clues and try to match relations could help improve the usefulness of Verbosity's dataset.

 

Paper B

 

This paper introduces two families of AI problems that can be used to construct CAPTCHAs. I kind of got bogged down in the definitions of P1 and P2 and would like to go over concrete examples of each in class. I thought the remark the authors make about gap amplification is interesting: even if we have a computer that is 80% successful at solving some CAPTCHA, presenting the CAPTCHA n times reduces the probability of passing all of them significantly. This is an important consideration in designing systems secured by CAPTCHAs – rather than make the CAPTCHA harder for both human and machine, it might make more sense to present a series of CAPTCHAs (though too many will frustrate the user).
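
As a quick worked example of gap amplification (my own arithmetic; the only figure from the comment above is the 80%): a bot that solves one CAPTCHA with probability 0.8 passes a chain of n independent CAPTCHAs with probability 0.8^n, which collapses quickly, while a hypothetical 99%-accurate human barely notices the chain:

```python
# Pass probability for a chain of n independent CAPTCHAs.
bot_p, human_p = 0.80, 0.99  # assumed per-CAPTCHA success rates

for n in (1, 3, 5, 10):
    print(f"n={n:2d}  bot: {bot_p**n:6.1%}   human: {human_p**n:6.1%}")

# n= 1  bot:  80.0%   human:  99.0%
# n= 3  bot:  51.2%   human:  97.0%
# n= 5  bot:  32.8%   human:  95.1%
# n=10  bot:  10.7%   human:  90.4%
```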

 

 

Hao-Yuh Su

 

Paper A

 

This paper gives a thorough introduction to Peekaboom: it talks about how it works, the data collected, the associated applications, and its evaluation. However, it seems that reading this paper is now the only way to understand Peekaboom, since the game has been removed from gwap.com. Why? According to this paper, Peekaboom has a good design that attracts continuous participation, well-rounded anti-cheating mechanisms, and desirable data. Everything looks fine.

Nonetheless, I have some thoughts about this game. First, the authors claim that one piece of information Peekaboom collects is how words relate to the image. The word might be a noun, a verb, or something else. Things might be straightforward when the word is a noun, but how well does this program perform when it gives a verb? Is the peek-and-boom mechanism suitable for a picture related to a certain movement? How well can people tell the corresponding movement when given a static image, especially when the image depicts some mundane movement like standing or sitting? Not to mention the increasing difficulty when players are only given partial pixels of it. However, perhaps this problem is not important to the designers, since the purpose of the game is to "locate objects in images." If so, there shouldn't be any verbs in Peekaboom. Another question is how to integrate it with image search engines such as Google. Is it practical? Or does such a collaboration already exist?

 

Paper B

 

This paper shows that CAPTCHAs can be applied in cryptography and possess advantages over traditional methods. The authors prove that it is hard for a computer to solve a CAPTCHA in the general case. I agree with this point of view. Still, I have one question. The first paper talks about training computer vision algorithms. I am wondering if it is possible to train computers to solve CAPTCHA problems using a similar method. That is, is it possible for an eavesdropper to acquire a sufficient amount of data (CAPTCHAs and the corresponding inputs) and develop a corresponding AI algorithm to solve such problems?

 

 

Alice Gao

 

Paper A

 

I enjoyed reading the Peekaboom paper a lot because it goes into much detail in describing the design of Peekaboom and justifying how the game design ensures accuracy of outputs to some extent. It also describes many data processing methods for filtering out data that might have been polluted by players trying to game the system. Peekaboom, as a game, definitely has a much richer design than the other games we have seen on the gwap.com website, and therefore it allows us to collect much more data that could potentially be useful for different purposes.

 

I think at this stage it would be interesting to take similar designs from each game and compare them with each other. So far, all these games have just been trying certain designs at random, and there is no systematic way of approaching the design problem. An interesting project would be to think about a few similar design options for a particular purpose, and then compare their relative effectiveness using both theoretical and empirical analyses. This would help us build a more systematic framework for the design of these types of games. The main idea is: it's not good enough to say that your design is good; what you should really try to do is make claims such as "my design is better than all these other designs because of (1), (2), (3), ...".

 

Paper B

 

The first thing that kind of surprised me is that CAPTCHA does not only refer to distorted text. It actually refers to a much more general concept: the paper defines it as a program that can generate and grade tests that are easy for humans but currently impossible for computer programs to solve. The second thing that interests me is that the issues discussed in this paper draw from both cryptography and artificial intelligence, a connection we don't see very often. The theories presented in this paper also surprised me a bit, since I didn't expect that it would be necessary to describe an intuitive concept with so much theoretical notation. One possible piece of future work that interests me is to identify possible approaches for computer programs to solve hard AI problems like CAPTCHAs. Of course, this is going to be a hard problem for a while, but it would be interesting to at least come up with a couple of ideas for approaching the problem first.

 

 

Rory Kulz

 

Paper A

 

It was cool to see this game, since it's not available on gwap.com. I thought it was a good point that GWAPs are not actually a way to solve problems by harnessing collective human computing power (although the authors mention that briefly in applications, and it might, say, be what Google Image Labeler is actually doing occasionally, perhaps on images it has preclassified as hard to discern) but rather a good way to solve the problem of making databases of good training examples, a nice thought. It was amazing to see just how much data was generated.

I was also curious about how some of the image area problems were handled (and still am, a little, with Squigl), so this paper was pretty useful in that regard. It's nice to see a simple approach work.

 

On a complete tangent, regarding writing, this is perhaps one of the first times I've seen an "Ethical Considerations" section in a computer science paper. How common is this for experimental papers in CS?

 

Paper B

 

I had read about CAPTCHAs before and possibly the authors' softer paper on the topic, and I've done some playing around with breaking them, actually, using pattern recognition (I was momentarily inspired out of necessity and http://caca.zoy.org/wiki/PWNtcha back a few years ago). I really like this formalism for hardness (although the paper is maybe a little too in love with notation), and the idea for steganography is very cool.

 

My one issue with this paper is more in the writing -- the analogy with cryptography and number theory only goes so far: while we don't know how hard factoring is, the question is presumably answerable. One day, with sufficient mathematics, one can imagine some provably optimal deterministic classical factoring algorithm. While difficulty in implementation (i.e. the size of the keys to choose) may be tied to asymptotics, there is a concrete value for its asymptotic difficulty. Here, the theory is predicated precisely upon estimates (human estimates) in all cases, as there is no notion of intellectual complexity the way there is computational complexity. So it's not quite as grounded in mathematical formalism as one would truly hope if the goal is to say deep things.

 

Really, this is a way to produce (and prove) useful estimates of hardness for one AI problem from estimates of hardness for another AI problem, regardless of how they were obtained, which is fine, although it's not necessarily billed that way. Anyway, a fun paper.

 

 

Peter Blair

 

Paper A

 

In this paper the authors describe Peekaboom, an AI training game in which users help to identify objects in a picture. In this game, one player, called "Boom," has a full picture and the associated word. By clicking on parts of the picture (those relevant to the word), the first player reveals portions of the picture to the second player, "Peek," who then guesses the word that Boom was given. Peekaboom also has a function called pinging, which allows the first player to hone in on a part of the object in the image, as in the example of the elephant picture with "trunk" as the word. The authors have very reasonable ideas about manipulation and cheating in the game. By awarding additional points for players using hints, Peekaboom garners higher-order information about the object.

In our previous class discussion, we talked about rare words and ways of encouraging more sophisticated participation by players. In the case of ESP, it may be worthwhile to follow Peekaboom's example and enable players to match on more than one word as a means of eliciting higher-order information. With respect to cheating, the authors note that there is little incentive to cheat and, besides, have implemented strategies to counteract cheating. This notion of discounting cheating and rewarding hints leads me to think about studying positive manipulations in a game -- manipulations that lead to desirable outcomes, i.e. outcomes that are consistent with the objectives of the game -- i.e. not all manipulation is bad. A case from a previous reading was online voting: if the objective is to drive traffic to a website, then an easily manipulable site is good, because it makes it easier and more alluring for voters heavily invested in the outcome of the vote to visit the site many times, which is the goal -- driving traffic to the website.

The elimination of poor image-word pairs is also a neat design feature! As is the designation of "gangster" for players earning 250k to 1 million points (I had thought to trademark this word two years ago; gangster = good). The comment from a user is also illuminating: it turns out that Peekaboom and similarly styled games provide the same rush as gambling, without the risk of losing one's earthly possessions, so there is even a positive social benefit. Perhaps Peekaboom could have a sweepstakes that gives away $1 million to a lucky player; that way the game will really be like gambling without the risk (huge rewards, no risk -- a great incentive structure for getting more people to play games and identify pictures).

It would be interesting to see how one would model this game. The model that Shaili presents in her paper would be well suited to modelling Peekaboom: there would still be a notion of a dictionary for the overall image, but in addition there may have to be sub-dictionaries that correspond to pixelated portions of the photograph. One potential challenge is that mice have very high pointing precision, so there is a very large number of ways to select 20-pixel portions of the image; one would have to introduce a coarser way of pixelating the photograph, so that shifting a few mm to the left or right corresponds to selecting the same 20-pixel region, i.e. there is a finite and reasonable number of 20-pixel slots to select with the mouse (maybe we can divide the picture into a puzzle or a mosaic)!
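
A minimal sketch of the coarse-selection idea at the end of the comment above (my own illustration; the 20-pixel cell size comes from the comment, everything else is hypothetical): snap every mouse click to a fixed grid cell, so clicks a few pixels apart select the same region of the mosaic:

```python
def snap_to_grid(x: int, y: int, cell: int = 20):
    """Map a pixel click to the top-left corner of its grid cell.

    Clicks a few pixels apart land in the same cell, so there is a
    finite, reasonable number of selectable regions per image.
    """
    return (x // cell) * cell, (y // cell) * cell

# Two clicks 3 pixels apart select the same 20x20 region of the mosaic.
print(snap_to_grid(141, 87))  # (140, 80)
print(snap_to_grid(144, 88))  # (140, 80)
```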

 

Paper B

 

In this paper the authors discuss CAPTCHAs, distorted text that must be deciphered in order for users to access an online service, e.g. online voting, email, etc. The idea is that humans are better at this type of image recognition than machines, so it would be hard to manipulate the process by, for example, writing a script that votes multiple times. The authors develop a framework for characterizing CAPTCHAs that helps with our understanding of what it means for a CAPTCHA to be hard to solve. Other interesting ideas include having multiple CAPTCHAs, which is a recent development that I have seen when logging into some online services. Multiple CAPTCHAs are effective when computer algorithms may be good, but not perfect, at deciphering the distorted words. As I was reading this paper, particularly section 5, I wondered why we don't just use ESP instead of CAPTCHAs. For example, before I log into my email I must play a game where I match on an image with another player, or with a computer-generated record of the word-matching game for a given image. This approach has two great benefits: (i) since image recognition is a hard AI problem, it will be very difficult for someone to write a script that can crack this, since if they could write such a script then they would have solved the AI problem that ESP is trying to solve; and (ii) ESP wins by getting another label on its image, and everyone is happy. The End :).

 

 

Michael Aubourg

 

As a matter of fact, we need Captchas. However, for many reasons Captcha systems create a serious accessibility barrier. Indeed they require the user to be able to see and understand shapes that may be very distorted and difficult to read. A Captcha is therefore difficult or impossible for people who are blind or partially sighted, or have a cognitive disability such as dyslexia, to translate into the plain text box.

 

And of course there can be no plain-text equivalent for such an image, because that alternative would be readable by machines and therefore undermine the original purpose. Since users with these disabilities are unable to perform critical tasks, such as creating accounts or making purchases, the Captcha system can clearly be seen to fail this group.

Such a system is also crackable. A Captcha can be understood by suitably sophisticated scanning and character recognition software, such as that employed by postal systems the world over to recognize handwritten zip or postal codes (this system uses neural networks). Or images can be aggregated and fed to a human, who can manually process thousands of such images in a day to create a database of known images -- which can then be easily identified. Or even worse, as I said in my project proposal, people can combine Captcha attacks with GWAPs: on the one hand, mal-intentioned people train their algorithm thanks to thousands of people playing; on the other hand, they put it to work breaking Captchas. This is a big drawback of GWAPs.

 

Recent high-profile cases of bots cracking the Captcha systems on Windows Live Hotmail and Gmail have emphasized the issue, as spammers created thousands of bogus accounts and flooded the systems with junk. Recently, the security firm Websense reported that the Windows Live Captcha can be cracked in as little as 60 seconds (!)

 

Some people have done projects on Captcha-cracking. The project PWNtcha ("Pretend We're Not a Turing Computer but a Human Antagonist") reports success rates between 49% and 100% at cracking some of the most popular systems, including 88% for the one employed by PayPal, which is very impressive.

 

Thus, the growth and proliferation of Captcha systems should be taken less as evidence of their success than as evidence of the human propensity to be comforted by things that provide a false sense of security. It is also a game of cat and mouse: both sides keep progressing, each catching up with the other, and so on.

 

Finally, algorithms can use humans to solve the puzzles. One approach involves relaying the puzzles to a group of human operators who can solve Captchas. In this scheme, a computer fills out a form and, as soon as it reaches a Captcha, it sends it to a human being. However, if human beings are involved, the process is going to be much slower than if only computers were involved.

 

In conclusion, it seems Captchas will never be a real barrier.

 

 

Victor Chan

 

Paper A

 

The Peekaboom paper presented a GWAP that allows players to generate data for locating objects inside images. The main contribution of the paper is describing Peekaboom's overall design, usage, and results. I find this paper interesting because Peekaboom's design is directly tied to the ESP game, since the authors mention that it utilizes labels generated by the ESP game. Furthermore, the results of Peekaboom seem more valuable to machine learning, since it provides training data for actual image recognition. The authors again discuss design choices such as single-player recorded games, anti-cheating methods, and the importance of incentive factors. The main insight of the results is that the bounding boxes generated by Peekaboom for an object are 75% accurate compared to a human volunteer boxing the same object. This is further supported by the pings' 100% accuracy at landing inside objects. These results validate Peekaboom as capable of generating accurate data that can be used for image recognition training.

 

I find Peekaboom more fun than the other GWAPs, since it is a genuinely different game; the others are really just variations on charades or other traditional games. Peekaboom is more interactive and gives players an incentive to reveal as little of the object as possible in order to maximize points. This works better than the ESP game, where users tend to just use a huge number of low-effort, high-frequency words. Perhaps ESP can be seen as a data mine, and Peekaboom as the filter that actually extracts the useful data.

 

One thing that was unclear: if the user reveals only very small parts of an object and the correct guess is made without revealing the entire object, would this result be useful training data? The machine learning algorithm would likely benefit more from seeing the entire object, i.e. the entire car rather than just a tire.

 

A project idea based on Peekaboom would be to see how it deals with pictures containing two of the keyword objects. For example, if the picture shows two cars, would the user reveal only one car to maximize points?

 

Paper B

 

The main contribution of this paper is establishing the theory behind CAPTCHAs and their use in security and AI. One idea of the paper is to show that CAPTCHAs are easy for humans to solve and hard for computers, and are therefore useful in fighting off brute-force attacks. The paper goes through the mathematical foundation for defining this problem for humans and for computers.

 

It also presents two families of CAPTCHAs based on different types of AI problems. Another point mentioned in the paper is gap amplification: even when the likelihood of a computer solving a single CAPTCHA is high, it can be reduced by requiring multiple CAPTCHAs to be solved. I found this to be a good deterrent against attacks, though it would be annoying to the normal user.

 

The paper's results can be used in a wide range of applications. It should be possible to take any AI problem that falls into the two families and transform it into a CAPTCHA for security purposes. Music recognition, image recognition, etc. could all work. However, what is unclear to me is how to generate the data for these CAPTCHAs in large quantities. It is understandable that transforming text can be done by a computer; using music recognition, however, would require a human to tag the music in the first place. Perhaps this is where GWAPs can be used?

 

 

Xiaolu Yu

 

If we expect to collect knowledge from volunteers, we must create ways to motivate them to contribute high-quality data. Von Ahn and his group started building interactive games that serve the dual purposes of acquiring knowledge and providing entertainment to motivate users. Notable such efforts include one of the papers discussed in lecture, Peekaboom, a game designed for segmenting objects in images, and a similar one, the ESP Game, for annotating images.

 

Web-based annotation tools like Peekaboom and CAPTCHA provide a new way of building large annotated databases by relying on the collaborative effort of a large population of users. The Internet game Peekaboom was invented to use "bored human intelligence" to label large image datasets with object, material, and geometry labels. As players of Peekaboom, we have already contributed millions of labeled objects. While location information is provided for a large number of images, often only small distinct regions are labeled and not entire object outlines.

 

There are a couple of potential weaknesses of CAPTCHA. For example, if human solvers are paid to classify each photo in a monkey/elephant database as either a monkey or an elephant, almost the entire database of photos can be deciphered for a relatively small cost, if the salary per person is very low. A related potential danger is that making minor changes to images each time may not prevent a computer from recognizing them: for one, image-comparison functions that are insensitive to many simple distortions could help an attacker; for another, an application similar to Peekaboom would facilitate recognizing the same pixels. Furthermore, we have all had the experience that an image warped enough to fool a computer is sometimes also troublesome for us.

 

Another potential problem is that most designs require only a yes/no answer for each picture, which allows bots to guess right answers. Furthermore, bots could accumulate knowledge to progressively improve the accuracy of their guesses over time.

 

At the end of the second paper, the authors mention that a program has been developed with an 80% chance of success in passing the test. I have been thinking about what would happen if a punishment for wrong responses were introduced into the verification process. Most human failures are caused by carelessness (users can choose to let the application generate a new image if they cannot identify the object); if punishment is introduced, human beings, aware that they could lose something, will undoubtedly maintain a higher success rate than computer programs. My point is that although it is encouraging to see this progress in AI, this could still remain a solvable problem for some specific cryptographic applications.

 

 

Ziyad Aljarboua

 

Paper A

 

This paper discusses Peekaboom, a game with a purpose that trains algorithms to locate objects in images with the help of people who play the game for entertainment. The paper addresses the lack of sufficient data for training such algorithms.

 

Information collected from players in traditional games about each picture helps algorithms identify objects in images; however, it does not help algorithms identify the location of objects within the image. Peekaboom collects information about the location of objects within the image, making vision algorithms more efficient. Peekaboom makes it possible to locate objects within the image by gradually revealing the picture during play.

 

While the structure of this game might seem heavily influenced by individual players' decisions, accurate information about images is obtained by combining the outcomes of several games played by different players. This process produces information that is less susceptible to individual variance.

 

While it might seem like a good idea to collect as much information as possible about images, I think over-describing an image might not be useful in some cases. If all animal images are broken down into body parts (tail, eye, etc.), any search for "tail" would return all those images. Also, I wonder about the importance of knowing the location of objects within the image. How would knowing the location of objects within the picture help improve search results?

 

Paper B

 

CAPTCHA is a program that generates and grades tests that only humans can pass and that machines, with today's capabilities, cannot. This paper presents a way to show how hard it is for a program to pass a test designed to block non-human activity: it bounds the probability that a program succeeds on a given test. The paper describes ways in which hard AI problems that computers fail to solve can be used for security purposes.

 

 

Travis May

 

Since the premise behind the Peekaboom game is very similar to the human computation premise discussed on Monday, I will instead focus on the second paper, which introduces CAPTCHA – a tool that has grown enormously in importance on the internet since the paper was written. The core premise is to create a problem that a human can easily solve while it is difficult (if not impossible) for a computer. Doing this ensures that humans, rather than bots, are actually using a particular website.

 

While this does eliminate some automation, it does not eliminate it in cases where there is substantial value for the spammer. The problem with this methodology is precisely that it is so easy for humans to solve the problem. Thus, if there is value attached to the process, it can be cheaply circumvented. A friend of mine has interacted with someone who runs a "forum marketing" business, where the entire business is to find message boards and blogs across the internet and post spam advertisements through an automated script. What does the script do when it encounters a CAPTCHA? It feeds all of the CAPTCHAs it encounters to a center in India, where the answers are provided by humans before it resumes.

 

While this slows down the process slightly, it's actually fairly cheap to circumvent. Imagine that the all-inclusive salary + overhead is $10/hour. A typical employee could handle one CAPTCHA every 10 seconds, or 360/hour. Thus, the cost per CAPTCHA is less than 3 cents. This creates a nuisance that reduces margins for the spammer, but it does not typically get rid of him.
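
The arithmetic checks out; as a quick sanity check using the comment's own figures:

```python
# Economics of outsourced CAPTCHA solving, per the figures above.
hourly_cost = 10.00              # all-inclusive salary + overhead, $/hour
solves_per_hour = 3600 / 10      # one CAPTCHA every 10 seconds -> 360/hour
print(f"${hourly_cost / solves_per_hour:.4f} per CAPTCHA")  # $0.0278
```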

 

Of course, any problem that increases this cost would likely serve as more of a nuisance for the typical, non-spamming user, creating a trade-off in designing an optimal system that prevents most spam while keeping most users.

 

 

Zhenming Liu

 

Paper B

 

This is a quite old problem (and this paper is also quite old), and I think CAPTCHAs have developed a lot in recent years. I interned at an MMORPG company and briefly worked with a team responsible for CAPTCHAs. I am actually quite convinced that most image-based CAPTCHAs (e.g., those used by Yahoo or Google) are attackable if hackers have a strong incentive to attack the system (e.g., if they can earn real money). What still surprises me is that on average it took hackers only 2 or 3 days to successfully attack any new CAPTCHA system we developed. Perhaps the more realistic question is how developers can react to the hackers efficiently and design new CAPTCHAs in a timely manner.

 

More recent ideas for designing new CAPTCHAs look more entertaining (though they are more remote from computer science). One example is to ask clients to tell jokes apart from ordinary daily news. Another is to show clients a few pictures of men and women and ask them to identify the pretty/ugly ones.

 

There also exist many other ways to successfully attack CAPTCHAs without designing a domain-specific AI program. For example (and I think this one is quite famous), one can collaborate with a porn site, redirect the CAPTCHA question to that site, and ask its visitors to answer the question before they continue to use the site's service.

 

 

Malvika Rao

 

Peekaboom seems to be a fun game to play. The calculation of bounding boxes is particularly clever. I would be interested in knowing more about the bot that plays when there are an odd number of players in the system. How is it designed?

 

CAPTCHAs are designed to distinguish humans from computer bots. Yet there seem to be 2 competing streams of research. On the one hand GWAPs train machines in cognitive tasks that are easy for humans but hard for machines to execute. On the other hand security CAPTCHAs are designed to differentiate humans by posing problems that computer programs would not be able to solve. How long before these 2 streams of research intersect? The paper states that it is a win-win situation: either we are able to differentiate humans from computers or computers start to solve hard AI problems. But if the latter event becomes a reality then how do we implement reliable online systems such as online voting, reputation, peer production systems? Or maybe we can classify with high guarantee some set of tasks that computers can never perform. This might propel us to investigate deeper into what it is that makes human cognition uniquely identifiable.

 

 

Brian Young

 

Both of Wednesday's papers deal with problems that humans are able to solve but that artificial intelligences have not yet learned to solve. This relationship is clearest in the use of reCAPTCHA technology, which is now fairly common. reCAPTCHA, like other forms of CAPTCHA, presents users with the task of identifying distorted text that computers have had difficulty understanding. It is unique, though, in that it also harnesses the power of human computation to solve useful problems: the words presented to users come from actual texts that have been difficult to interpret using optical character recognition, so by entering the words, users help to digitize and process the text.

 

My first idle thought was to combine reCAPTCHA with something like the game Squigl, in which players try to trace relevant parts of an image. Though I know very little about optical character recognition, it seems likely to me that computers would find it even more useful to know where each letter is, and more specifically which strokes make up the letters.

 

However, imagining such a system in place leads me to be hardly enthusiastic about this brainstorm. CAPTCHA is a security feature, and as such, even people who recognize its usefulness find it a pain to deal with, as anecdotal data (i.e. every time I use CAPTCHA) would suggest. The fact that a security feature is required implies that the user is trying to do something else that is important enough to require security. Although in a different context, tracing the images might be fun, incorporating it into reCAPTCHA (and requiring a certain level of proficiency) would not be terribly popular, I surmise, much as even someone who enjoyed playing basketball might not be so big a fan of a door that required him or her to sink five shots before unlocking.

 

In my comments from Monday (11-17), I wondered whether the "with-a-purpose"-ness of the games provided an incentive to play. The authors of the Peekaboom paper answer that knowing that playing the game helps solve some problem in artificial intelligence does indeed add to players' enjoyment. Again, people don't generally take any enjoyment from CAPTCHAs, but does their knowledge that reCAPTCHAs are useful and not entirely meaningless make them dislike them less, if you can navigate through my pronouns?

 

 

Avner May

 

I thought that both of these papers were incredibly important. The Peekaboom paper takes ESP to the next level, providing data that could very feasibly be used to train computer vision algorithms. Whereas ESP gave information that could be very useful for image search, Peekaboom uses the ESP results to extract the location knowledge necessary for machine learning algorithms. I am surprised I do not already see the results of Peekaboom (at least overtly) in Google image search or something along those lines; I was very impressed by the object-bounding "boxes" and by the ping pointers. I would be very interested to see whether this data set has already proven useful for machine learning; have computers been able to beat previous performances due to this better training data set? How useful is this data in reality? How much of an obstacle is the lack of training data for machine vision algorithms, versus other obstacles for this problem?

 

I thought that the Captcha article was quite revolutionary. It seems to introduce the birth of a field -- the application of hard AI problems to security. It provides a theoretical foundation for the grounds on which you can use an AI problem as the center of a security scheme. The fruits of this work are already seen all over the internet, as captchas are a very common way to detect and block unwanted bot activity. I am interested in hearing what other "captcha"-like technologies are currently being used as well (other AI-based security schemes...).

 

One thing I was wondering after reading this was whether peekaboom could be used to break "captchas".  If you fed captcha images (the distorted words) to peekaboom, and asked players to identify the letters (just like they would normally identify other image elements), could this eventually produce enough training data to break captcha?