A New Way to Fight Blog Comment Spam

Spam in blog comments has lately become a problem in the blogosphere. Bloggers have been busy manually deleting entries and blocking IP addresses, and some people have come up with comment spam filters that use keywords and such, much as email spam filters do.

Now here’s a thought: Comments are sent using forms on web pages, and these pages are controlled by the blog owners – right? This means comment spam is radically different from email spam, where the sender’s only connection to the recipient is knowing (or guessing) his or her email address.

I believe a solution to the problem would be to require the sender to do something “uniquely human”, similar to the image identification methods used by many free email services to fight off robot registrations.

As an example, a comment page could ask the commenter to “Click the image of the duck, the dog and the diamond to post your comment” and then display a selection of, say, 9 pictures to choose from. These would in turn be selected randomly from a huge set of such pictures. If he or she gets the sequence right, the comment is posted, but fail to match the sequence in, say, 3 tries, and your comments are blocked. This is almost no additional hassle for the commenter, but it is a task that – if the system is decently implemented – would be very hard for a spam program to get right.
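To make the idea a bit more concrete, here is a minimal sketch of the server-side logic in Python. Everything in it is an assumption made for illustration: the `Challenge` class, the tiny `IMAGE_POOL` of labels and the constants are hypothetical names, and a real system would need a much larger picture set, would have to serve the pictures without revealing their labels, and would have to tie each challenge to the visitor's session.

```python
import random

# Hypothetical pool of picture labels; a real deployment would need a far larger set.
IMAGE_POOL = ["duck", "dog", "diamond", "cat", "car", "tree",
              "house", "fish", "apple", "boat", "clock", "shoe"]
GRID_SIZE = 9      # pictures shown to the commenter
SEQUENCE_LEN = 3   # pictures that must be clicked, in order
MAX_TRIES = 3      # failed attempts allowed before blocking


class Challenge:
    """One picture challenge shown on a comment page (illustrative only)."""

    def __init__(self, rng=None):
        # random.SystemRandom() would be a better default in production;
        # an injectable generator keeps the sketch easy to test.
        if rng is None:
            rng = random.Random()
        self.grid = rng.sample(IMAGE_POOL, GRID_SIZE)      # pictures displayed
        self.answer = rng.sample(self.grid, SEQUENCE_LEN)  # sequence to click
        self.failures = 0

    def prompt(self):
        *first, last = self.answer
        return (f"Click the image of the {', the '.join(first)} "
                f"and the {last} to post your comment")

    def check(self, clicks):
        """Return 'posted', 'retry' or 'blocked' for one attempt."""
        if self.failures >= MAX_TRIES:
            return "blocked"
        if list(clicks) == self.answer:
            return "posted"
        self.failures += 1
        return "blocked" if self.failures >= MAX_TRIES else "retry"
```

The comment form would show `challenge.prompt()` above the nine pictures and pass the clicked labels, in order, to `challenge.check(...)`. How hard this is for a spam program depends entirely on how well the pictures hide their labels, which this sketch does not address.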

11 comments

  1. Technically, that is an interesting and fun suggestion. Practically, however, it doesn’t achieve what you would like it to achieve.

    In fact, using your test, OCR and image recognition technology would make some computers more human than blind (or partially sighted) people, users of text-based or aural browsers (Lynx, JAWS, etc.), or users on slow connections (who may prefer to browse with images off).

    There are better ways, but unfortunately assuming that people use the same methods of access that you do is not one of them…

  2. Good point Jay.

    While the groups you talk about are a minuscule part of all web surfers, their needs should obviously not be overlooked. As for the slow connections, a quick calculation tells me that this could be implemented using images that would be less than 1KB each (even as low as 100B by using b/w images).

    A solution for the visually impaired could use audio hints instead of images, maybe: “type the common name of the animal that says MIAAAW”. A similar approach could probably be implemented for text browsers as well.

    So I still believe that this method could be used, even though the exact implementation may vary.

  3. Indeed, offering MULTIPLE choices of rich-media test would be the way to go to make sure that you cover all the bases. Plus an easy email link to the site owner in case someone has trouble with the available choices….

  4. The assumption made when suggesting these types of systems (known as captchas) is that comment spam is left by automated bots. I don’t think that’s the case. There are several pieces of evidence that suggest comment spam is placed by real people, not machines. These people would have no problem clicking on the duck.

  5. Another interesting point. In this entry on your blog, I see your analysis of your log files, clearly indicating human behavior. Still, some of the discussions on comment spam (including some of the discussions on your blog) show that much of it is still being posted by robots.

    My take on this is that as long as what we’re up against is “only human”, it can be fought by us removing the spam as soon as we find it. That way the spammers will give up wasting their time and return to their natural habitat of email (or searching for new ways to bother the rest of us). Spam filters can certainly be used to help in this fight.

    Robots, on the other hand, won’t give up so easily, and using captchas (thanks for improving my vocabulary) will help keep them away.

  6. Ha!
    You (posting comments here) seem to assume that all robots are bad. How about a hypothetical (I think) scenario where a human designs a robot just to spread a useful thought? Mind you, I mean a carefully designed artificial intelligence that looks for a few discussions that might benefit from the right comment, even though that comment is distributed by a robot.
    (Btw, thanks Jay for your “more human than blind people” statement. I sort of collect phrases like that…)

  7. Being a tech geek, I wonder about the implementation.
    Forms have one thing in common, i.e. the <form> tag. I have not studied those spam bots, but I guess they pick out the form, its parameters and its submit URL, and then submit some content.
    The easy way to solve this, use captchas and keep it accessible to all users is to use radio buttons with value=cat, value=dog, etc. and change the submit URL with JavaScript if the user selects the right answer 🙂
    …but is it worth the hassle?

  8. Personally, I find spamming very annoying. I’d like to see solutions that don’t require more work on the users’ side.
