There Was a Filter, After All

March 11th, 2006

I recently commented on a spam deluge that hit my site, from a spammer who used a seemingly non-filterable style. All the spams had 3-4 links each to non-spam web sites (news, educational, movie web sites), and all had different fake IP addresses, emails, names, and messages. It seemed there was no way to filter the things. None got through and appeared on my site, mind you, but enough reached moderation that I had to lower my limit of links allowed per comment, the only way I could figure to keep the flood from littering my moderation queue. This kind of spam deluge happens every so often, and is probably a test by spammers to see whose blogs are most vulnerable to attack.

In my recent post, I noted that there was no way to filter the things outside of a straight URL count. Only after the flood ended a few days ago did I realize that there actually was something filterable, and I had missed it. It was subtle, and was undoubtedly the means by which the spammer could search for blogs that fell prey to his little test. It was a bogus value in an HTML attribute. Usually, a link in HTML takes the simple form <A HREF="address">. “A” is the anchor (link) tag, and “HREF” introduces the address of the web page being linked to. Sometimes there’s a “TARGET” attribute, which directs where the link will open, such as in a new window.

Well, this spammer sneaked an extra attribute into one, and only one, of the links in each spam: REL="itsok". The “REL” attribute is supposed to describe the relationship of the linked page to the current page. Some theorized that the “itsok” spam was intended to foul up anti-spam measures. The “REL” attribute can hold the value “nofollow,” which tells Google and other search engines not to give the link any ranking credit, thereby denying the spammer “Google Juice.” The introduction of a fake “rel” value might be intended to break the “nofollow” handling. However, this spam flood linked only to legitimate sites, not spam sites, so there’s no point in breaking “nofollow” unless, again, it is a test.
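For anyone who wants to catch this sort of thing automatically, here is a rough sketch of how a comment could be scanned for odd “rel” values. It is only an illustration, not what my blog software actually does: the function name, the class name, and the short list of “known” rel values are all my own assumptions, and you would want to adjust the list for your own site.

```python
from html.parser import HTMLParser

# Assumption: a small allowlist of rel values considered normal in comments.
KNOWN_REL_VALUES = {"nofollow", "external"}


class RelCollector(HTMLParser):
    """Collects every rel token found on <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rel_tokens = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A REL=...> works too.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "rel" and value:
                # rel may hold several space-separated tokens.
                self.rel_tokens.extend(value.lower().split())


def suspicious_rel_values(comment_html):
    """Return the rel tokens in a comment that are not on the allowlist."""
    parser = RelCollector()
    parser.feed(comment_html)
    parser.close()
    return [token for token in parser.rel_tokens if token not in KNOWN_REL_VALUES]
```

Run against a comment containing a link with REL="itsok", this would report “itsok” as the suspicious value, while a link marked rel="nofollow" would pass unnoticed.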

But others theorize that this kind of deluge is an attempt to corrupt the spam blacklists. When spam hits blogs, the URLs in the spam get reported to central blacklists that many people draw upon. The theory here is that if enough legitimate sites get entered into the public blacklists, then they will become useless as real sites will be blocked along with the spam sites.

When I did a search on Google Blog Search, I found three other bloggers who had found the “itsok” attribute in a November-December spate of attacks. Others have found it as well. A regular Google search will find people talking about the attribute, but it will also reveal all the sites that did not block the spam, which is undoubtedly what the spammer used to beef up their spam lists. Add the word “spam” to the search and you’ll get a purer list of people talking about the itsok spam.

I originally missed this whole “itsok” thing because my email program (where I see the spam being reported) translates the link code into an actual link, hiding the code. In the future, I’ll be sure to examine the spam much more carefully. Had I known, I could have just added “itsok” to my blacklist and avoided restricting links in comments. Oh well, live and learn.
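To make the difference between the two approaches concrete, here is a minimal sketch of a moderation check that tries the blacklist first and only then falls back on a link limit. The names, the limit of two links, and the three outcomes are my own assumptions for illustration, not the actual settings or logic of my blog software.

```python
import re

# Assumptions: a one-word blacklist and a hypothetical link limit.
BLACKLIST_TERMS = ["itsok"]
MAX_LINKS = 2

# Rough pattern for an opening <a ... href tag; good enough for counting links.
LINK_PATTERN = re.compile(r"<a\s[^>]*href", re.IGNORECASE)


def moderate(comment_html):
    """Return "block", "hold", or "accept" for a raw comment body."""
    lowered = comment_html.lower()
    # Blacklist check: any listed term anywhere in the raw markup blocks the comment.
    for term in BLACKLIST_TERMS:
        if term in lowered:
            return "block"
    # Fallback: too many links sends the comment to the moderation queue.
    if len(LINK_PATTERN.findall(comment_html)) > MAX_LINKS:
        return "hold"
    return "accept"
```

Note that the check has to run on the raw markup, not on the rendered comment, which is exactly where I went wrong by reading the spam in my email program. A plain substring match like this would also block an innocent comment that happened to contain the letters “itsok”, so it is a blunt tool.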
