Referrer spam, comment spam, trackback spam, spam, spam, spam, spam, hams, eggs, and spam
Web site operators are subject to kinds of spam that the general public never sees. Spam isn't just the unsolicited email that gets all the attention, just as the spam discussed on this page is not the packaged meat product by Hormel Foods. ([w:spam|SPAM on the Wikipedia])
First one needs to understand what the referer is. A web page request normally includes data saying which page referred the user to the requested page. When someone clicks a link on a web page, the page they end up visiting is told which page referred them. But if someone types the URL into the location bar, there is no referer.
It's very helpful in studying your website traffic to look at the referers, so that you understand where your visitors are coming from.
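Here's a rough sketch of that kind of traffic study. It assumes your server writes the common "combined" access log format, and the sample log lines, host names and the `referer_hosts` function are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# A combined-format line ends with: "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def referer_hosts(log_lines):
    """Count how many requests each referring host sent us."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not a combined-format line; skip it
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # no referer, e.g. the URL was typed in directly
        host = urlparse(referer).hostname
        if host:
            counts[host] += 1
    return counts

# Made-up sample data, not taken from a real log.
SAMPLE_LOG = [
    '192.0.2.1 - - [14/Feb/2005:10:00:00 +0000] "GET /a.html HTTP/1.1" 200 512 "http://example.org/links.html" "Mozilla/5.0"',
    '192.0.2.2 - - [14/Feb/2005:10:00:05 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [14/Feb/2005:10:00:09 +0000] "GET /b.html HTTP/1.1" 200 734 "http://example.org/other.html" "Mozilla/5.0"',
]

print(referer_hosts(SAMPLE_LOG).most_common())  # [('example.org', 2)]
```

The second sample line has "-" in the referer position, which is how a request with no referer usually shows up in this log format.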
Unfortunately the referer is very easy to forge, because it is just a value supplied by whatever software makes the request. This has resulted in spammers making requests to thousands of web sites, each request carrying a forged referer that points to their own web site.
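To see why forging is so easy, here's a minimal sketch using Python's standard `urllib`. Both URLs are hypothetical, and nothing is actually sent; the code only builds a request and looks at the header it would carry.

```python
from urllib.request import Request

# Hypothetical URLs, for illustration only.
request = Request(
    "http://victim.example/some-article.html",
    headers={"Referer": "http://spammer.example/"},
)

# Nothing has been sent; we only inspect the request as built.
print(request.get_header("Referer"))  # http://spammer.example/
```

The receiving server has no way to check that the user ever saw the page named in the header, so whatever the client claims is what lands in the logs.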
What's the advantage? It's twofold:
First, it causes curious website administrators to visit the site that did the spamming. Suppose you're reviewing your website statistics and see a link from a site with a URL talking about poker, or hot girls, or whatnot. Why would you get a referral from such a site? Of course you're going to be curious about why that site is referring to yours. That makes you click on the link and go there, at which point you've become a new visitor to that site.
Second, sometimes the website statistics reports get published somewhere that is reachable by a search engine. If a search engine scans your website statistics and sees referer spam, it will count that spammed link just as if it were a "real" link. The more links the better, right?
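One crude defense is to drop suspicious referers before they ever reach a published report. The sketch below is only an illustration of the idea: the keyword list, the function names and the sample URLs are all my own assumptions, and simple keyword matching will produce both false positives and misses.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; a real one would need regular upkeep.
SUSPECT_KEYWORDS = ("poker", "casino", "pills")

def looks_like_referer_spam(referer, keywords=SUSPECT_KEYWORDS):
    """Return True if the referer's host name contains a suspect keyword."""
    host = urlparse(referer).hostname or ""
    return any(keyword in host for keyword in keywords)

def clean_referers(referers):
    """Drop suspect referers before they reach a published report."""
    return [r for r in referers if not looks_like_referer_spam(r)]

# Made-up sample data.
referers = [
    "http://example.org/links.html",
    "http://best-online-poker.example/",
    "http://friend-blog.example/2005/03/post.html",
]
print(clean_referers(referers))
# ['http://example.org/links.html', 'http://friend-blog.example/2005/03/post.html']
```

Keeping the statistics pages out of reach of search engines in the first place removes the second motive entirely, so a filter like this is best thought of as a supplement.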
A nice overview article on referer spam: http://www.spywareinfo.com/articles/referer_spam/
Detailed article on referer spam: http://www.kuro5hin.org/story/2005/2/14/02558/3376
Another detailed article: http://www.abcseo.com/papers/referrer-spam.htm
An Apache-specific method to block referer spam: http://www.ilovejackdaniels.com/apache/block-referrer-spam/
Blog posting on referer spam: http://www.unix-girl.com/blog/archives/000264.html
Comment or Trackback spam
This refers to features present in the newer kinds of websites, the ones built on some kind of content management software. A blog or other content-management-driven website lets people comment on its pages, or submit "trackback" links.
Both comments and trackbacks can be submitted by software, which makes them an efficient way to spam zillions of web sites.
You can experience comments on the web page you're reading right now. At the bottom of this article will be a link allowing you to leave a comment. Click on the link (registration required) and type some text and hit the submit button. You can enter a link to your own web site just as easily as any other text. Hence, when the search engines scan my site they might see the link you entered, which then counts as another link to your site.
Trackbacks are even simpler. You'll see on this page a section saying "Trackback URL for this page" followed by a URL. Most blogging software allows you to send a trackback, which is a very useful way for web site operators to notify each other when they write an article about someone else's article. When the trackback is received by the web site software, the software puts a link somewhere pointing to the page that sent the trackback.
That gives the clear motive for both comment and trackback spam: it creates more links to the spammers' sites.
A few months ago the search engines got together and came up with a solution to comment and trackback spam. They decided that whenever they see a link that uses the rel=nofollow attribute, the search engine will not count that link for popularity. If implemented, this completely cuts the effectiveness of comment and trackback spam. Unfortunately it's not implemented in all blog or content management systems.
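To make concrete how little work a trackback takes to send from software, here's a sketch of building one. As I understand the TrackBack protocol, a ping is an HTTP POST of form-encoded fields (`url`, plus optional `title`, `excerpt` and `blog_name`) to the trackback URL. The function name and the sample values are my own, and the code only builds the POST body without sending anything.

```python
from urllib.parse import urlencode, parse_qs

def build_trackback_body(url, title="", excerpt="", blog_name=""):
    """Form-encode the fields of a trackback ping (field names assumed
    from the TrackBack protocol as I understand it)."""
    fields = {
        "url": url,              # the page that is sending the trackback
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }
    # Leave out optional fields that were not supplied.
    return urlencode({name: value for name, value in fields.items() if value})

# Hypothetical values, for illustration only.
body = build_trackback_body(
    "http://example.org/my-reply.html",
    title="My reply",
)
print(parse_qs(body))
```

A handful of lines like these, run in a loop over a list of trackback URLs, is all a spammer needs, which is why the receiving software has to do the checking.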
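For software that doesn't do this yet, the change is small. Here's a minimal sketch of how a comment system might render a visitor-supplied link with the attribute set. The function name and sample values are hypothetical, and real blog software will have its own templating.

```python
from html import escape

def render_comment_link(url, text):
    """Build an anchor tag for a visitor-supplied link, marked rel="nofollow"."""
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True),  # escape quotes so the URL can't break out of href
        escape(text),
    )

print(render_comment_link("http://example.com/?a=1&b=2", "my <site>"))
# <a href="http://example.com/?a=1&amp;b=2" rel="nofollow">my &lt;site&gt;</a>
```

The link still works for human readers; the attribute only tells search engines not to count it.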
An article on ways to combat comment spam: http://www.sixapart.com/pronet/comment_spam
Referer spam and the rel=nofollow attribute: http://virtuelvis.com/archives/2005/03/killing-referrer-spam-forever
Comment spam clearinghouse: http://www.jayallen.org/comment_spam/
Combating website spam with WordPress: http://codex.wordpress.org/Combating_Comment_Spam
Article on fighting trackback spam in Movable Type: http://www.elise.com/mt/archives/000577trackback_spam.php
Article on fighting trackback spam in WordPress: http://blog.mytechaid.com/archives/2005/03/09/wordpress-trackback-spam-s... ... and another: http://www.bloggingpro.com/archives/2005/01/05/fighting-trackback-spam/
A plugin for WordPress to help fight spam postings: http://unknowngenius.com/blog/wordpress/spam-karma
Another WordPress plugin -- http://idli.cs.rice.edu/~dsandler/trackback/trackback-validator-plugin/ -- which checks that the referring page includes a link to the page being trackback'd. This simple check should block most spammers, because they'd have to generate a unique page for each trackback they send.
An individual blogger comparing TypePad with WordPress for blocking trackback spam: http://blogging.typepad.com/how_to_blog/2005/07/fighting_trackb.html
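The link-back check described above (does the page that sent the trackback actually link to the page being trackback'd?) can be sketched in a few lines. This is my own illustration of the idea, not the plugin's code. It assumes the sending page's HTML has already been fetched, and it compares URLs exactly, where a real check would probably normalize them first.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def page_links_to(page_html, target_url):
    """Return True if page_html contains a link whose href is target_url."""
    collector = LinkCollector()
    collector.feed(page_html)
    return target_url in collector.hrefs

# Made-up pages and URLs, for illustration only.
honest_page = '<p>See <a href="http://myblog.example/post/42">this post</a>.</p>'
spam_page = '<p>Buy now at <a href="http://spammer.example/">our shop</a>!</p>'

print(page_links_to(honest_page, "http://myblog.example/post/42"))  # True
print(page_links_to(spam_page, "http://myblog.example/post/42"))    # False
```

A trackback whose sending page fails the check would simply be rejected or held for moderation.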