Ed Price is Hungry

(but not very often)

Preventing comment spam (cont.)

Death to the autobots

You’re probably all too familiar with the godawful Captcha screens (and variants) that appear on many web forms. And, after that sentence, you’re probably also well aware of my feelings towards such techniques. Captcha is based upon the Turing test principle, which is really just a way of testing whether someone’s a real person or a machine. The typical method is to include an element on your web form that requires cognition, as opposed to mere calculation – machines can calculate, but they’re rubbish at thinking, even the ones that look like Schwarzenegger.

Captcha works because you have to look at a picture and determine what words or letters are displayed on it. Captcha doesn’t work because, to ensure the picture can’t simply be scanned and OCRed by a machine, the words need to be obscured to such a degree that even a human can barely work out what they say half the time. These days I tend to view the employment of Captcha as a usability failure.

Nevertheless, the idea of using a Turing test is sound. For my blog I use a simple maths sum: I give the user two numbers and ask them to add the two numbers up. This might cause minor issues to those who have genuine problems with basic maths, but a calculator is never far away on any computer. On my initial attempts I stupidly included the answer as a hidden field on the form – needless to say the spam kept rolling in. I quickly fixed that and, to make it a bit harder for bots, the current version requests the answer in digits, but phrases the question using words. Here’s the basic script:

$numbers = array("zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine");

$a = rand(0,9);

$b = rand(0,9);

$answer = $a+$b;

The word values for the numbers are placed in an array – those familiar with array structures will immediately see that the key for each entry in the array equates to the value (with the first key in a standard array always being 0). In other words, the key for “zero” is 0, the key for “one” is 1, and so on.

On the second and third lines we generate two random numbers between 0 and 9 (as digits, this time). Note that we use a consistent range of 0 to 9 throughout; this range can always be expanded if needed, but the array and the range for the random selector need to match. On the fourth line we calculate the sum of the two random numbers selected: this value is used to verify the answer that the user enters on the form.
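As a rough sketch, the same steps can be wrapped in a helper that takes its range from the array itself, so the two can't drift apart. The function name make_question and the shape of the returned array are my own inventions for illustration, not part of the script above:

```php
<?php
// Hypothetical helper: builds one question from the word list.
function make_question(array $numbers)
{
    // Taking the upper bound from the array keeps the random range
    // and the word list in step if the list is ever extended.
    $max = count($numbers) - 1;

    $a = rand(0, $max);
    $b = rand(0, $max);

    return array(
        'text'   => $numbers[$a] . ' plus ' . $numbers[$b], // shown to the user
        'answer' => $a + $b,                                // kept for checking
    );
}

$numbers = array("zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine");

$question = make_question($numbers);
echo $question['text'], "\n";
```

With the ten words above the answer always falls between 0 and 18.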

We do have a slight problem, however: we need to tell the processing script which answer was generated by the form script, otherwise how can it verify whether a user has entered the correct value or not? If your processing script is on the same page as your form then it’s not such an issue, but for added security I have the form and processor separated – therefore I need to send the user-entered answer as well as the generated (and correct) answer to my processing script so that they can be compared.

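To resolve this we use a secret value to ‘salt’ the answer and then we hash it (md5 is a one-way hash rather than true encryption, which is fine here because the processor only needs to compare values, never decode them):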

$s_answer = md5($salt.$answer);

The salt value can be anything you want – in the case of my blog it’s partly derived from something specific to both my domain and the blog post itself. Either way, the important thing is to ensure that the value is the same on both the form and processor script, and that it never appears in the page itself.
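Here's one way the salt could be put together and used, purely as a sketch. The names make_salt and salt_answer, the secret string and the post ID of 42 are all made up for the example; my real salt is built differently and stays private:

```php
<?php
// Hypothetical: combine a private site-wide string with the post's ID.
// Both scripts would need to build the salt in exactly the same way.
function make_salt($site_secret, $post_id)
{
    return $site_secret . ':' . $post_id;
}

// Same call as in the article: md5() of the salt followed by the answer.
function salt_answer($salt, $answer)
{
    return md5($salt . $answer);
}

$salt     = make_salt('example-secret-do-not-reuse', 42); // assumed values
$s_answer = salt_answer($salt, 11);

echo $s_answer, "\n"; // a 32-character hex string
```

Because md5 always gives the same output for the same input, the processor can rebuild the hash from the user's answer and compare the two without ever knowing the original sum.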

The relevant form field looks like this:

echo '<input name="answer" id="answer" />';

echo '<input type="hidden" name="s_answer" value="'.$s_answer.'" />';

echo '<p>Please enter the sum of <strong>'.$numbers[$a].' plus '.$numbers[$b].'</strong> in digits (e.g \'19\')</p>';

The first line is the input field for the user to type in their answer. The second line contains the ‘salted’ answer as a hidden field. The third line tells the user what they need to do and displays the numbers (from the $numbers array) that need to be added together. Note how we use the randomly generated $a and $b values to pull the equivalent words from the $numbers array. You’ll be able to see the real life example at the bottom of this page.
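If you'd rather build the markup in one place, something along these lines would do it. The function name render_question is hypothetical, and the htmlspecialchars calls are an extra precaution I've added for the sketch rather than something the script above relies on:

```php
<?php
// Hypothetical helper that returns the three form lines as one string.
function render_question(array $numbers, $a, $b, $s_answer)
{
    // Escaping is an added precaution; an md5 hash and the number words
    // contain nothing that needs it, but a changed salt scheme might.
    $hash  = htmlspecialchars($s_answer, ENT_QUOTES);
    $words = htmlspecialchars($numbers[$a] . ' plus ' . $numbers[$b], ENT_QUOTES);

    return '<input name="answer" id="answer" />' . "\n"
         . '<input type="hidden" name="s_answer" value="' . $hash . '" />' . "\n"
         . '<p>Please enter the sum of <strong>' . $words . '</strong> in digits (e.g \'19\')</p>';
}

$numbers = array("zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine");

// 'example-salt' is a placeholder; 2 and 9 stand in for the random picks.
echo render_question($numbers, 2, 9, md5('example-salt' . 11)), "\n";
```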

We process the posted values as follows:

$user_answer = (int)$_POST['answer'];

$salt_answer = $_POST['s_answer'];

// Strict comparison, so that two hashes that merely look like numbers are never treated as equal
if( md5($salt.$user_answer) !== $salt_answer ) {

    $errmsg[] = "Please answer the security question correctly";

}

Simply put, if the salted version of the user’s answer does not match the salted answer posted from the hidden field then an error message is generated and the comment is not submitted.
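A slightly more defensive version of the same check might look like the sketch below. The function name answer_is_correct is hypothetical, and I've swapped in PHP's hash_equals (available from PHP 5.6) for the plain comparison; it also rejects missing fields and anything that isn't one or two digits before bothering with the hash:

```php
<?php
// Hypothetical helper: returns true only when the typed answer matches the hidden hash.
function answer_is_correct($salt, array $post)
{
    // Both fields must be present and must be plain strings.
    if (!isset($post['answer'], $post['s_answer'])) {
        return false;
    }
    if (!is_string($post['answer']) || !is_string($post['s_answer'])) {
        return false;
    }

    // The form asks for digits, and the largest possible sum is 18.
    $typed = trim($post['answer']);
    if (!preg_match('/^[0-9]{1,2}$/', $typed)) {
        return false;
    }

    // hash_equals() compares the two strings without PHP's loose type juggling.
    return hash_equals(md5($salt . (int) $typed), $post['s_answer']);
}

$salt = 'example-salt'; // placeholder; the real salt stays private

// Simulated form post for "two plus nine".
$post = array('answer' => '11', 's_answer' => md5($salt . 11));

if (!answer_is_correct($salt, $post)) {
    $errmsg[] = "Please answer the security question correctly";
}
```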

If you prefer not to resort to basic maths then you could use a similar technique to ask simple questions (e.g. “What color is the sun? Yellow”). However, this may cause problems if someone decides the sun is white, or if they can’t spell yellow correctly, or if they think you’re referring to the red sun of Krypton, and so on. Maths is less ambiguous.
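To show what I mean, here's a rough sketch of the word-based variant. The helper names and the placeholder salt are made up for the example. Tidying the input (trimming spaces, lower-casing) forgives some sloppiness, but a misspelling or a different opinion about the sun still fails:

```php
<?php
// Hypothetical helpers for a word-based question.
function normalise_answer($text)
{
    // Ignore stray spaces and capital letters.
    return strtolower(trim($text));
}

function salt_text_answer($salt, $text)
{
    return md5($salt . normalise_answer($text));
}

$salt     = 'example-salt'; // placeholder value
$s_answer = salt_text_answer($salt, 'yellow');

var_dump(salt_text_answer($salt, ' Yellow ') === $s_answer); // matches once tidied up
var_dump(salt_text_answer($salt, 'yelow') === $s_answer);    // misspelling does not match
var_dump(salt_text_answer($salt, 'white') === $s_answer);    // neither does a different opinion
```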

Posted:  February 24, 2010 at 16:19

Filed under: Web Design

Author: Justin (contact)

Last edit: February 25, 2010 - 12:21

2 comments

JayZee March 1, 2010 - 18:01

Brilliant article. I've been toying with the idea of using emotion as verification. For example, pointing the user to a news article (or forcing them to read a sentence) and asking them if it made them happy or sad. eg "Cute little puppies in a washing machine = sad" but "washing cute little puppies = happy"

JRC March 1, 2010 - 19:23

Great idea - but what if you like sticking cute puppies in a washing machine? Or is this a form of psychological evaluation designed to stop the 'wrong' people from commenting? If so - it could be the latest thing (just don't tell Murdoch)

Add a comment

All comments are subject to approval prior to appearing on the site.
HTML code is NOT allowed and will be stripped out.

Please enter the sum of two plus nine in digits (e.g '19')

Copyright

The content on this blog is protected by a Creative Commons license. This is purely to stop people from doing nasty things with my words - in the unlikely event that you do want to reproduce any content here just ask

Creative Commons License

Ed Price Is Hungry by Justin Cawthorne is licensed under a Creative Commons Attribution-Non-Commercial-No Derivative Works 3.0 Unported License.
Based on a work at www.edpriceishungry.com