Monday, 1 November 2010

Crowd Source Luddite: Should You Give Free Work to Google?

Have you noticed that Facebook sometimes gives you two words like these to type in to confirm you are human? This is called a Captcha. When there are two words like this, you have a reCaptcha. In a two-word reCaptcha only the first word is really a test. The second word is being used by Google to crowdsource the conversion of digitally scanned books into text. The idea is that if someone gets the first word correct, they will probably get the second word correct too. You might have noticed that the second word is always an actual word, though the first one is often just a meaningless combination of symbols. Once the second word I got was in Russian, which probably does not happen to people often.
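
To make the idea concrete, here is a minimal Python sketch of the scheme as described above. It is not Google's actual code: the function name, the sample words, and the simple vote counting are all my own assumptions for illustration. Only the first (test) word decides whether you pass; whatever you type for the second word is just recorded as a vote.

```python
from collections import Counter

def check_two_word_captcha(test_answer, typed_test, typed_scanned, votes):
    """Hypothetical check: pass or fail on the test word only.

    If the user passes, whatever they typed for the scanned word is
    counted as one vote for what that scanned word says.
    """
    passed = typed_test.strip().lower() == test_answer.strip().lower()
    if passed:
        votes[typed_scanned.strip().lower()] += 1
    return passed

# Assumed example: the test word is "morning", the scanned word reads "upon".
votes = Counter()
check_two_word_captcha("morning", "morning", "upon", votes)      # passes, votes "upon"
check_two_word_captcha("morning", "morning", "upon", votes)      # passes, votes "upon"
check_two_word_captcha("morning", "morning", "anything", votes)  # passes, junk vote
check_two_word_captcha("morning", "mourning", "upon", votes)     # fails, no vote

# The most common answer is taken as the best guess for the scanned word.
best_guess, count = votes.most_common(1)[0]
```

Notice that the third call still passes even though the second word is junk. That is exactly why you can type in anything for the second word and get through.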

Most sites that use reCaptcha include a link and tell you clearly that your "cognitive surplus" is being used by Google to convert scanned books into text. But Facebook does not seem to think this is worth telling you. You can "elect" not to take part in Google's efforts by typing in anything for the second word. In the case above the item was shared with Facebook just fine.

Though I think the idea of reCaptchas is way cool, I have two issues. First, I don't see why Facebook does not bother to tell you that you are doing free work for Google. On a deeper level, I am not sure that Google will use the free work I provide for good. If the plan is only to convert scanned books into digital text so they can be shared with the world, that is great. Give me all the reCaptchas I can do. But I am personally a bit troubled by the possibility that Google might try to sell what I am doing. In that case Google would be getting work from everyone and then charging them for the benefits of the work they did. I think that is called slavery.

Though I will continue to fill out my reCaptchas correctly, I would like to see Facebook be more public about this, and to get a firmer assurance from Google that it will not seek to profit from the work done. And if Google does sell the work, people have a right to some of the profit generated from their own work. Google already does this with AdSense and YouTube revenue sharing.

Now, if you are thinking about a revolt against reCaptcha, don't bother. If, say, 10% of users started to put in wrong second words in protest, Google would only have to implement a system of mixed checks. Sometimes the second word could be the test and the first the scanned word, and sometimes both words could be a test. People who repeatedly type in the wrong second word could easily be dealt with (up to loss of access to the authentication service, and fewer free web products). But for right now, if you have a moral issue with Google, you can type in whatever you want for the second word.
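
A system of mixed checks like the one imagined above could be sketched roughly as follows. Again, this is only my guess at how such a thing might work, not anything Google has published: the three layouts, the function names, and the sample words are all hypothetical.

```python
import random

# Hypothetical layouts: True marks a position that is a test word,
# False marks a position that is an unverified scanned word.
LAYOUTS = [(True, False), (False, True), (True, True)]

def make_challenge(rng):
    """Randomly pick which of the two positions are tests."""
    return rng.choice(LAYOUTS)

def grade(layout, known_answers, typed):
    """Pass only if every test position matches its known answer.

    Entries of known_answers at non-test positions are ignored.
    """
    return all(
        t.strip().lower() == k.strip().lower()
        for is_test, k, t in zip(layout, known_answers, typed)
        if is_test
    )

def count_failures(layouts, known_answers, typed):
    """How many of the given layouts would this typing habit fail?"""
    return sum(not grade(layout, known_answers, typed) for layout in layouts)

known = ("morning", "upon")
honest = ("morning", "upon")      # types both words correctly
protester = ("morning", "xxxx")   # always types junk for the second word

honest_failures = count_failures(LAYOUTS, known, honest)
protester_failures = count_failures(LAYOUTS, known, protester)
sample_layout = make_challenge(random.Random(0))
```

Under these assumptions the honest user never fails, while the protester fails whenever the second position happens to be a test. Since the user cannot tell which layout was served, junk typing in a fixed position stops being a safe way to opt out, and the repeated failures make the protester easy to spot.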

Frankly, I am surprised the entire reCaptcha thing has not been majorly hacked. I remember SETI's perfectly altruistic and open effort to get people to give their unused computing time to the search for intelligent life in the Universe. The program was made impossible by hackers very quickly, proving that there may or may not be intelligent life in the Universe, but there is none on Earth.
