Thursday, September 17, 2009

CAPTCHA

For some time I have been filling out CAPTCHA images on various sites. At an abstract level I knew the image challenge was there to prevent non-human users from gaining access to the site. However, I did not know that the term for it was CAPTCHA until this morning, when I came across this announcement that Google has acquired reCAPTCHA. What is more interesting still is that each time a human successfully answers a reCAPTCHA-powered challenge, it helps the company digitize part of a scanned text. For example, reCAPTCHA may scan an old edition of the New York Times, and OCR will likely recognize and convert to text a large percentage of the document. The pieces that are not recognized are then turned into images and presented to human beings for transcription. So how does the computer know that the human has answered correctly? Well, the challenge is composed of two parts: one known and one unknown. If the human answers the known part correctly, it is assumed that the answer to the unknown part is correct as well. The assumed-correct answer to the unknown part is then validated against a larger sample of answers to ensure its accuracy.
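
To make the two-part check concrete, here is a minimal Python sketch of the idea as I understand it. This is not reCAPTCHA's actual code: the class name TwoWordChallenge, its methods, and the min_votes threshold are all assumptions I made up for illustration. The sketch only shows the logic described above: trust the unknown answer when the known one is right, then accept a transcription once enough people agree on it.

```python
from collections import Counter


class TwoWordChallenge:
    """Hypothetical model of a challenge with one known and one unknown word."""

    def __init__(self, known_answer, min_votes=3):
        # known_answer: the word the system already knows the text of.
        # min_votes: assumed agreement threshold, not a documented value.
        self.known_answer = known_answer.strip().lower()
        self.min_votes = min_votes
        self.votes = Counter()  # guesses for the unknown word

    def submit(self, known_guess, unknown_guess):
        """Return True if the user passes; record their unknown-word guess."""
        if known_guess.strip().lower() != self.known_answer:
            # Failed the known word: reject and ignore the unknown guess.
            return False
        # Passed the known word: assume the unknown guess is honest.
        self.votes[unknown_guess.strip().lower()] += 1
        return True

    def accepted_transcription(self):
        """Return the most common guess once it has enough votes, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.min_votes else None


challenge = TwoWordChallenge("morning", min_votes=2)
first = challenge.submit("morning", "upon")    # passes, one vote for "upon"
second = challenge.submit("mourning", "open")  # fails the known word
third = challenge.submit("Morning", "upon")    # passes, second vote
result = challenge.accepted_transcription()
print(first, second, third, result)
```

Note that a single person who gets the known word right cannot settle the unknown word alone; the guess only counts as one vote until others agree with it.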
