Prepare for more comment spam, not less

Perspective: When I learned that search engine firms and blogging platforms had endorsed the rel=nofollow attribute, I doubted the stratagem would ultimately lead to less comment spam. And from the looks of this interview with a link spammer, I’d even suggest you get ready for more spam, not less.

In case you’re not familiar with the basics, so-called “search engine optimizers” are out there to get their customers’ web sites on the first page of search engine results. And one thing that counts according to Google and its kin is the popularity of web sites, which is measured as the number of links to the web site.

Now of course, you might object that simply counting the number of links isn’t necessarily the best way to measure the relevance of a web page or a web site. But it is better than not measuring anything at all, and search engine firms process the raw results in much the same way sociologists process the results of their surveys. For instance, you might want to give more weight to a link from a generally well-rated web site (e.g. Google PageRank), give more weight to a link from a page on the same topic (e.g. Teoma’s notion of authority), and even reprocess some information on the grounds that you know some patterns are more or less relevant than others (i.e. fight spam).
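To make the “more weight to a link from a well-rated site” idea concrete, here is a minimal sketch of a PageRank-style score computed by repeated redistribution. It is a textbook simplification, not what Google actually runs; the function name, the damping value of 0.85, the iteration count and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank-style score.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to,
                # so a link from a well-rated page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web: a, b and d all link to c; c links to a.
    toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 3))
```

In this toy graph, c comes out on top because three pages link to it, and a ranks above b and d even though it has a single inbound link, because that link comes from the best-rated page.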

Anyway, the importance of links inevitably led to the link farming business. And to farm links, you can:

  • Get listed in the thousands of existing web site directories
  • Set zillions of web sites up yourself and make them link to your site
  • Comment-spam whichever web sites allow user interaction

The “web directories method” is safe but expensive. At best, you’ll semi-automate it by hiring Chinese web surfers to do the job.

The “link farm” method is comparable to a cancer. The farm sites link to one another in much the same way pages on your web site link to one another. Together they form more or less organic clusters, which a search engine can then analyse and process as such. In the end, when a cluster is very dense in links within itself but never or hardly ever receives an outside (read: legitimate) link, the search engine eventually notices it and eliminates or downgrades it, since it is probably facing spam.
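The heuristic above can be sketched in a few lines: measure how densely a cluster links to itself, count the links it receives from outside, and flag it when the first is high and the second is near zero. The function names, the thresholds and the toy graph below are assumptions for illustration; real search engines surely use far more elaborate signals.

```python
def cluster_stats(cluster, links):
    """Return (internal link density, number of inbound links from outside).

    `links` maps each site to the sites it links to.
    """
    cluster = set(cluster)
    internal = 0
    inbound_from_outside = 0
    for source, targets in links.items():
        for target in set(targets):
            if target not in cluster or source == target:
                continue  # ignore links leaving the cluster and self-links
            if source in cluster:
                internal += 1
            else:
                inbound_from_outside += 1
    # Maximum number of directed links between distinct cluster members.
    possible = len(cluster) * (len(cluster) - 1)
    density = internal / possible if possible else 0.0
    return density, inbound_from_outside


def looks_like_link_farm(cluster, links, min_density=0.5, max_outside_inbound=0):
    """Flag dense clusters that (almost) nobody outside links to.

    Both thresholds are arbitrary assumptions, not known search engine values.
    """
    density, outside = cluster_stats(cluster, links)
    return density >= min_density and outside <= max_outside_inbound


if __name__ == "__main__":
    # Hypothetical graph: three farm sites linking only to each other,
    # plus a few ordinary sites linking among themselves.
    web = {
        "f1": ["f2", "f3"], "f2": ["f1", "f3"], "f3": ["f1", "f2"],
        "blog": ["news"], "news": ["blog", "shop"], "shop": ["news"],
    }
    print(looks_like_link_farm({"f1", "f2", "f3"}, web))
    print(looks_like_link_farm({"blog", "news"}, web))
```

The farm cluster is flagged because nothing outside it links in, while the blog and news pair escapes thanks to a single inbound link from the shop site. That one outside link is exactly what comment spam is meant to supply.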

The “comment spam” method is obviously the best of the lot and is comparable to metastasis. It is easily automatable and costs virtually nothing, and it yields high-PR links from various locations because, until the next Google dance, blogs typically have high PR. Moreover, it goes undetected until it is too late, because search engine rankings are ultimately based on self-fulfilling prophecies: once you’re on top, you tend to stay there.

With that in mind, would you expect that putting rel=nofollow attributes on the comment links on your web site will remove all incentive to spam it? I certainly wouldn’t. Read Sam’s opinion on the topic:

To start with, spamming for casinos and porn sites looks a lot more lucrative than my job as a freelance consultant, and Sam is in no way interested in going out of business. Thus, as long as he finds an efficient way to comment-spam, he will.

Then, notice that he compiles target lists “by searching for sites with keywords such as ‘Wordpress’, ‘Movable Type’ and ‘Blogger’”, and not “by searching for (…) ‘Wordpress’ (…) without the rel=nofollow attribute.” And you’re fully aware, of course, that spammers seldom bother cleaning up their databases.

What’s more, the incentive to comment-spam is two-fold for his customers: besides the links themselves, the comments are seen by people. Millions of blog readers visit millions of blogs every day, so comment-spamming blogs is more or less the same as email-spamming blog authors and readers.

Last but not least, the very reason that led search engine firms to introduce the rel=nofollow attribute will ultimately work against them and against the internet as a whole. Indeed, the massive adoption of technologies such as pingbacks will ultimately lead to more links, and many will be relevant. As such, assuming no major breakthrough occurs in the next few years, the same search engine firms that pushed rel=nofollow will ultimately be choosing between ignoring entire chunks of relevant information and letting comment-spammers farm links.
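To illustrate that dilemma, here is a rough sketch of how a crawler might separate the links it counts from those it ignores. It relies only on Python’s standard html.parser module; the class name, the function name and the sample markup are made up for the example, and nothing here reflects how any actual search engine handles the attribute.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sort the links found in a page into followed and ignored (nofollow)."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.ignored = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.ignored.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, ignored) link lists for an HTML fragment."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followed, collector.ignored


if __name__ == "__main__":
    # Hypothetical blog page: one link in the post, two in the comments.
    page = (
        '<p><a href="http://a.example/">post link</a></p>'
        '<p><a rel="nofollow" href="http://spam.example/">casino</a></p>'
        '<p><a rel="nofollow" href="http://b.example/">relevant reply</a></p>'
    )
    followed, ignored = split_links(page)
    print("counted:", followed)
    print("ignored:", ignored)
```

Note that the relevant reply lands in the ignored list right next to the casino link: the attribute cannot tell the two apart, which is the choice described above.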

In the end, despite the rel=nofollow initiative, I’d expect more comment-spam, not less.
