
Beating the Spammers

November 13th, 2006

About a month ago, somebody asked me if there was a way they could write their email address on a web page and yet avoid having it picked up by spammers. I was about to tell them it was impossible, but then I had an idea–and it seems to have been a good one. I tested it, and indeed, it did work. The best part is, it incorporates a technique used by spammers themselves, and beats them at their own game!

A little background first. Putting an email address on a web page, or for that matter, anywhere on the Internet where the public can see, is an open invitation for spam. Spammers use automated programs, called “bots,” to “harvest” email addresses. The bots scour every last web page, discussion group, and other public piece of information on the Internet for anything that looks like an email address. When they find one, they add it to a list, and start sending spam to it.

I know this is true because I have tested it. First, I create a brand-new email address (e.g., “brandnewemailaddress@blogd.com”), one which has never been used before and which no one but me knows about. I then put the address up on this blog’s main page. To ensure that only the spammers’ bots can see it, I make the address the same color as the background, rendering it invisible to the human eye. Five months ago, I put up one such address, and after a week, spam started coming through; after one month, it had drawn 41 spams; this week, it has been getting about 7-8 spams per day, and has collected about 500 spams altogether.

So, I know these bots are constantly surveilling my web site. I know that any email address posted in such a fashion will be picked up. So how could I post an email address and not have it be picked up?

The idea came from a technique I saw spammers use themselves. When spammers send email, they know that certain words will trip spam filters, and that will send their spam to the waste pile, where it will never be read. One key word, for example, is “Viagra.” So spammers who want to use this word will try to disguise it. One way is to misspell the word, for example, “V1@gra” or any other of a hundred variants. But a different technique, one I saw spammers use years ago, works to foil the spammers themselves.

It involves the use of HTML code. HTML is the language used to write web pages. If you go to the “View” menu of your browser and choose the “View Source” command (or anything that promises to show the “source”), you’ll see the same page as it was originally written: the source code. You will see that it is filled with stuff inside <angled brackets>. On a web page, anything in angled brackets is considered a command for the browser. As a simple example, “<b>” is a command to make text bold. One “harmless” command is <!-- text -->. That is a comment command: an opening angled bracket, an exclamation point, and a double hyphen, then the comment text, then another double hyphen and a closing angled bracket. It doesn’t do anything; it’s intended solely as a comment in the code. Because it’s an HTML command, it does not “render,” that is, it does not get shown to the viewer on the web page; it is “edited out” by the browser.
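
To make the “edited out” part concrete, here is a minimal sketch in Python (my choice of language for illustration; a browser does not work this way internally). It uses the standard library's `html.parser` to collect only the text a reader would see. The class name `VisibleText` and the function `visible_text` are hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect only the text between tags; tags and comments are skipped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for ordinary text that sits between tags.
        self.parts.append(data)

    # handle_comment is left at its default, which does nothing,
    # so the contents of <!-- ... --> never reach the output.


def visible_text(html):
    """Return roughly what a reader would see for a small HTML fragment."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered at the end
    return "".join(parser.parts)


print(visible_text("<b>Hello</b><!-- a note to myself --> world"))  # Hello world
```

The bold tag and the comment both vanish from the output; only the words meant for the reader remain.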

Now, spammers used to use this as a way to break up a word so it would get past the spam filters for email. For example, instead of writing “Viagra,” if they instead wrote “Vi<!-- text -->ag<!-- text -->ra,” the spam filters of the time would not see the word “Viagra,” but since an email reader will render HTML code, the stuff in the brackets would disappear for the person who was looking at the email, and they would see “Viagra” in the clear. Clever! Until, of course, the email spam filters were updated, and it no longer worked for spammers, so they stopped doing it.
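
Here is a rough sketch of why that worked and why it stopped working. It is not how any real spam filter is written; the block list and both function names are assumptions made up for illustration. The first check looks at the raw message source, and the second removes HTML comments before looking.

```python
import re

BLOCKED_WORDS = {"viagra"}  # hypothetical block list, for illustration only


def naive_filter_flags(raw_html):
    """Old-style check: search the raw message source for blocked words."""
    lowered = raw_html.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def updated_filter_flags(raw_html):
    """Remove HTML comments first, then run the same check."""
    without_comments = re.sub(r"<!--.*?-->", "", raw_html, flags=re.DOTALL)
    return naive_filter_flags(without_comments)


message = "Cheap Vi<!-- text -->ag<!-- text -->ra here"
print(naive_filter_flags(message))    # False: the word never appears unbroken
print(updated_filter_flags(message))  # True: the comments are gone before the check
```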

But apparently, the spammers never updated their own bots to filter out their own trick! I tried writing an email address in the clear, broken up by this old spammer’s trick, and it has been a month, and not a single spam has been generated! In all other tests where I put up email addresses, spam started coming within a week, and dozens had come by the end of the first month.
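
I have never seen a harvesting bot's code, so what follows is only a guess at the mechanism: a sketch that assumes the bot runs a simple pattern match over the raw page source. The pattern, the function names, and the `example.com` address are all made up for illustration.

```python
import re

# Simplified address pattern; an assumption, not what any real bot uses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def harvest(page_source):
    """Return everything in the raw source that looks like an email address."""
    return EMAIL_RE.findall(page_source)


def harvest_after_stripping_comments(page_source):
    """What a bot would find if it removed HTML comments before searching."""
    return harvest(COMMENT_RE.sub("", page_source))


page = "Write to me: some<!-- b -->one@exam<!-- c -->ple.com"
print(harvest(page))                           # []
print(harvest_after_stripping_comments(page))  # ['someone@example.com']
```

Note that in this sketch a comment has to break up the part after the @ sign. If only the first half were broken, the simple pattern would still pick up a fragment such as “one@example.com”.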

So if you want to post an email address so that people can see it but spammers (who are not people, after all) cannot, then add those comment commands within the HTML code on your web page. Look at this new email address I just made:

spammerssuck@blogd.com

Now, I didn’t really type that in the HTML code. You see it as being in the clear, but if you were to look at the HTML code for this page, you would see that it really looks like:

spa<!-- toy -->mm<!-- blue -->erss<!-- bottle -->uck@blo<!-- phone -->gd.<!-- box -->com

Note that in the HTML comments breaking up the email, I inserted random common words–another spammer’s trick, to throw off filters. But really, you could probably put anything in the comments; it likely doesn’t matter.
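
If you would rather not type the comments by hand, a few lines of code can do it. This is only a sketch: I typed mine in manually, and the function names, the fixed chunk size of three characters, and the way filler words are picked are all assumptions for illustration. The filler words are the ones used above.

```python
import random
import re

FILLER_WORDS = ["toy", "blue", "bottle", "phone", "box"]


def obfuscate(address, chunk=3, rng=None):
    """Break an address into short pieces separated by HTML comments."""
    rng = rng or random.Random()
    pieces = [address[i:i + chunk] for i in range(0, len(address), chunk)]
    result = []
    for index, piece in enumerate(pieces):
        if index > 0:
            # A comment with a random filler word goes between pieces.
            result.append(f"<!-- {rng.choice(FILLER_WORDS)} -->")
        result.append(piece)
    return "".join(result)


def strip_comments(html):
    """Remove HTML comments, roughly what a browser does when rendering."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)


hidden = obfuscate("spammerssuck@blogd.com")
print(hidden)                  # paste this into the page's HTML
print(strip_comments(hidden))  # spammerssuck@blogd.com
```

The second line printed is what a visitor should see. The first will differ from run to run, since the filler words are chosen at random.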

Now, will you be safe by doing this forever? Hard to say. Spammers might never pick up on this, or they might write a fix for this tomorrow. Heck, they might read this blog regularly, and I might be tipping them off. My guess is that they won’t bother changing their code to account for this trick until a significant number of people start using it. So if you’re forced to put your email address up on a web page anyway, and you don’t have access to complex coding that might protect it, then you might as well give this method a try.

Categories: BlogTech
  1. Paul
    November 14th, 2006 at 13:48 | #1

    You want a weird spam item?

    Fox News has been spamming me.

    In the comments, in the admin panel I can see what IP address a commenter is posting from.

    About a half-dozen times in the past month or so, I’ve gotten a comment that doesn’t really make sense in the context of the post, contains a single link to a Fox News story, and the IP address that the commenter was from goes back (via WhoIs) to an address in Fox News’s block.

    Very weird. It’s not quite often enough to make it worth tracking down, but I almost wonder if they’re intentionally targeting liberal sites to screw up search engines or get their own site higher ranked or something.

    Paul
    Seattle, WA

  2. Luis
    November 14th, 2006 at 17:36 | #2

    Paul: Actually, this is almost certainly not Fox doing this. What you’re experiencing is a common spam phenomenon. Sometimes spammers send out useless spam. Sometimes it is a test of their systems, and they include a special text string that they will be able to use Google to find in order to judge penetration. Other times, they send spam which is empty (a malfunction?) or, even more commonly, spam which has links to major web sites, often for news and computer sites. It is not 100% clear why they do this, but many believe that they are trying to attack master spam lists. Some organizations keep master lists of spammers. Individual blogs can often add offenders to the spam lists automatically. The list is then used to screen spammers. If spammers can add enough legitimate sites to the master spam list, they can foul it up and make it less usable.

    If you go to the “Blog Tech” category (on the sidebar at right), you can read the entries about spam. In several of them, I have described all manner of spam received, which you can expect to get yourself at some time. One specific post is this one, but there are others as well. A variation is described here.

  3. MTEXX
    February 11th, 2007 at 04:06 | #3

    Some good ideas in here. I’ll be using the comment-in-email from now on!

    I wish I had a javelin missile and street address of every spam house in the world.
