Comment Spam - Fight!
August 20, 2010
A number of websites hosted on UGAL have been hit pretty badly by comment spam over the last few days. Yes, once more. The situation was so bad for a few sites that we had to disable comments on them completely.
First, a big thank you to all of you for your patience. We know how annoying and time-consuming spam is to deal with, and we truly appreciate your understanding while we were implementing new protections.
We are happy to announce that better spam protection is now in place for all our websites, and that comments have been turned on again for all. We are pretty confident that we are much better protected now, while keeping it as easy as possible for your visitors to post comments.
What is comment spam?
Comment spam is spam that bots post automatically to HTML forms they find on the internet. It can be any type of form (blog comment, contact form, product information request, etc.); the bots do not care. They are just happy to fill them up with whatever "message" they want to send. Millions of forms are filled in every hour, and even if only a tiny fraction of a percent of readers click a link, the spammers behind the bots have a decent chance to make money out of this shady business.
What is the new protection like?
We wanted to make it more difficult for bots to post their comments, while not changing the user experience for legitimate users. That is why we decided against the Captcha route. Instead we implemented a number of known techniques that bots do not like:
- Obfuscated field names: the names of the form fields (in the HTML code) are now obfuscated, and change at every page request. This makes it much more difficult for bots to automatically put an email address in a field, because its name is not "userEmail" anymore but "F8336290227B801".
- Honeypots: a number of fake fields are added to the forms and hidden from the users. Bots do not understand the techniques used to hide those fields and have a tendency to fill them with information. Stupid bots. When a form is submitted with data in one of its honeypots, it is marked as spam.
- Timers: bots are much faster than humans and can fill forms very fast. When we detect that a form has been filled too fast, we mark it as spam.
- Content validation: if a form submission passes all the tests above, our servers send the form to a service provider for content analysis. It takes a couple hundred milliseconds to get a "spam" or "ham" reply from the service. We were not too happy with the service we were using, and have switched to TypePad AntiSpam. It has a very good reputation; let's see how it performs for us.
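For the curious, here is a minimal sketch in Python of how the first three checks could fit together. It is not our production code: the secret key, the 3-second threshold, the decoy field names and the helper names (`obfuscate`, `new_form`, `classify`) are all assumptions made up for illustration. The idea is that each page request gets a random nonce, field names are derived from that nonce, and a signed timestamp travels with the form so the server can tell how long the visitor took.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = b"change-me"                 # hypothetical server-side secret
MIN_FILL_SECONDS = 3.0                    # hypothetical "too fast" threshold
REAL_FIELDS = ("userName", "userEmail", "comment")
HONEYPOT_FIELDS = ("website", "phone")    # hypothetical decoys, hidden from humans


def obfuscate(logical_name: str, nonce: str) -> str:
    """Derive an opaque field name that changes with every nonce."""
    message = f"{nonce}:{logical_name}".encode()
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return "F" + digest[:14].upper()


def new_form(now: Optional[float] = None) -> dict:
    """Build the per-request data a page template would render."""
    nonce = secrets.token_hex(8)
    issued_at = time.time() if now is None else now
    stamp = f"{nonce}:{issued_at}"
    signature = hmac.new(SECRET_KEY, stamp.encode(), hashlib.sha256).hexdigest()
    return {
        "token": f"{stamp}:{signature}",  # sent back in a hidden input
        "fields": {n: obfuscate(n, nonce) for n in REAL_FIELDS},
        "honeypots": {n: obfuscate(n, nonce) for n in HONEYPOT_FIELDS},
    }


def classify(token: str, posted: dict, now: Optional[float] = None) -> str:
    """Return "spam" or "ok" for a submitted form."""
    parts = token.split(":")
    if len(parts) != 3:
        return "spam"                     # missing or malformed token
    nonce, issued_at, signature = parts
    expected = hmac.new(
        SECRET_KEY, f"{nonce}:{issued_at}".encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return "spam"                     # token was tampered with

    # Honeypot check: humans never see these fields, so they stay empty.
    for name in HONEYPOT_FIELDS:
        if posted.get(obfuscate(name, nonce), "").strip():
            return "spam"

    # Timer check: a form submitted almost instantly was not typed by a human.
    current = time.time() if now is None else now
    if current - float(issued_at) < MIN_FILL_SECONDS:
        return "spam"

    return "ok"                           # content validation would come next
```

In this sketch, a submission that comes back as "ok" would then be handed to the content validation service; one that leaves the honeypots empty but arrives a second after the page was served is still marked as spam by the timer.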
What is next?
For now we will be monitoring how the new protection performs. From the last few hours of use, it looks very effective: fewer spam submissions, all spam properly detected, and all legitimate comments making it through the system. We will adjust settings based on what we see, and hopefully we will be able to call it good for a while, so that we can concentrate on more positive developments...
Thanks again for your understanding while your inboxes were filling up with junk. Comments are open below, so do not hesitate to use them, or email us at firstname.lastname@example.org with any questions.