Burn Before Reading


Last week, spammers discovered blogs. Many blogs -- not this one! -- use tools to crawl their referrer logs; these tools then update the blog to include a list of "incoming links." It's a step towards a nice idea -- I've ranted before about how good it would be if all links on the Web were bidirectional -- but it suffers from a fatal weakness. Reload a page a few thousand times with a fake source URL and that URL will jump to the top of the "incoming links" list.
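To make the weakness concrete, here is a minimal sketch of how such a tool might rank referrers. It is not taken from any real blog package; the function name, the log format (pairs of client IP and referrer URL), and the example.com-style addresses are all assumptions made up for illustration.

```python
from collections import Counter

def top_incoming_links(log_entries, limit=5):
    """Rank referrer URLs by raw hit count (a hypothetical naive tool)."""
    counts = Counter(referrer for _ip, referrer in log_entries if referrer)
    return [url for url, _hits in counts.most_common(limit)]

# Hypothetical log data: (client IP, referrer URL) pairs.
# Forty different readers arrive from one genuine link...
honest = [("10.0.0.%d" % i, "http://friendly.example/post") for i in range(40)]
# ...while one machine reloads the page 3000 times with a forged referrer.
forged = [("192.0.2.1", "http://spam.example/buy")] * 3000

ranking = top_incoming_links(honest + forged)
print(ranking)
```

Because the tally counts hits rather than people, the forged URL outranks the genuine one even though it came from a single machine.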

This is exactly what a few enterprising spammers have started doing; they're placing text ads on blogs that didn't even realize they had commercial "sponsors." So far, the big money appears to be in selling these tools to desperate (and dumb) online marketers.

I was pondering this business model today with a measure of awe. It's brilliant, and it's wholly evil. And then I thought of something even more evil. Comment forms.

It's so ridiculously easy. Many blogs -- not this one! -- have comment forms attached to every post. All you have to do is go to a blog, click on a "comment on this post" link, type MAKE MONEY FAST into the "Comments" field, and hit the submit button. Bingo: that's an ad.

If you're a robot, you can do this to an awful lot of blogs awfully quickly. Finding blogs is no problem. There are central indices of blogs -- and lots of blogs link to other blogs in very standard ways. Parsing for the comments links is easy -- there are only a few major blog tools with significant market share and each has a standard HTML format for comments links. And the forms themselves are trivial -- they're just CGI scripts; all you need is a target URL (easily parsed out from the source) and you're all set.

The bad guys are going to figure this out soon. Someone out there is probably already writing a script to do it. And when this villain unleashes his dastardly tool on an unsuspecting blogiverse, the devastation will be profound. Comments sections are going to become unusable, and probably almost overnight.

There are other comments architectures that are more secure against such attacks. The anti-spam vigilantes have a whole toolkit: centralized IP block lists, heuristic text-matching filters, (technically) mandatory registration, moderation, and others besides. But retrofitting these features onto existing blog-comment systems will take time, programming effort, and user upgrades. A long time, a lot of effort, and many, many upgrades.
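For a rough idea of how two of those tools fit together, here is a small sketch combining an IP block list with a heuristic text filter, falling back to moderation when a comment looks suspicious. Everything in it is an assumption for illustration: the names, the sample addresses, the patterns, and the link threshold are invented, and a real filter would need far better rules than these.

```python
import re

# Hypothetical entries, as if fetched from a centralized block list.
BLOCKED_IPS = {"192.0.2.1"}

# Hypothetical heuristics: one known spam phrase, plus a cap on links.
SPAM_PHRASES = [re.compile(r"make\s+money\s+fast", re.IGNORECASE)]
LINK = re.compile(r"https?://\S+", re.IGNORECASE)
MAX_LINKS = 2

def screen_comment(ip, text):
    """Return 'reject', 'moderate', or 'accept' for a submitted comment."""
    if ip in BLOCKED_IPS:
        return "reject"      # known bad sender: drop outright
    if any(pattern.search(text) for pattern in SPAM_PHRASES):
        return "moderate"    # suspicious wording: hold for a human
    if len(LINK.findall(text)) > MAX_LINKS:
        return "moderate"    # too many links: hold for a human
    return "accept"

print(screen_comment("198.51.100.7", "MAKE MONEY FAST"))
```

Sending borderline comments to moderation instead of rejecting them is a design choice in this sketch: text heuristics misfire, and a held comment can still be rescued by a person.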

Get started now, guys. Now.

Update (2 Nov 2002): It turns out this is one of those Darwin-Wallace cases of multiple simultaneous independent discovery. Someone out there has been trying this attack, albeit in a crude form. And several very good discussions of the security implications and possible responses have been taking place over the last few days. dive into mark has what seems to be the canonical summary.

Ironically enough, I found this discussion by reading my referrer logs. I discovered I'd been linked from BoingBoing. The discussion forum there contained a note saying "This Laboratorium guy needs to get out more" and linking to Mark's discussion. Now, while it's an open and notorious fact that I need to get out more, I had never heard of any of the blogs on which this issue was being discussed until today. Nor did the thread make any of the meta-blogs I follow. Since the discussion is so recent, it's not in Google.

In short, I had no way of knowing that there was a thriving discussion of these issues already taking place. It would be nice if there were a way of finding existing -- including recent -- discussion (which would, of course, take some form of searchability). If the Ted Nelson-style Internet Utopia is going to be built on the backs of blogs, functionality of this sort will be an important prerequisite.