
Those Clever Spammers

October 19th, 2003

Recently my spam has climbed past 30 messages per day. My Eudora spam filter is good, but once in a while something slips through, or legit mail gets junked. The problem is that no automated spam filter is 100% reliable, and you never know exactly what it is doing. For example, I use email forms on web pages to give my students tests. Once in a while, one of the emailed tests lands in my junk box, and I have no idea what triggered it. To better my odds, I decided to try devising some spam filters of my own.

Most web host control panels let you create your own customized spam filters that block incoming messages before your email program gets a chance to download them. The trick was to find text strings that were unique to spam emails. So I started up Eudora, which houses my 4-year-plus archive of tens of thousands of emails, including a good deal of archived spam, and I started going through the most recent spam, email by email, looking for something that seemed unique. When I found something, I would do a search for the string in Eudora; if the search produced only spam, I had found a silver bullet. If legit email turned up in the results, the filter could backfire on me. After all, since a server-side filter blocks the email completely, I would not have the option I have with Eudora’s junk filter: quickly scanning what landed in my junk box.
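The search-and-check routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the archive here is a tiny made-up list of (body, is_spam) pairs, and the function names are hypothetical. Eudora's own search did the real work.

```python
def evaluate_candidate(phrase, archive):
    """Count the spam and legit messages that contain phrase (case-insensitive)."""
    needle = phrase.lower()
    spam_hits = 0
    legit_hits = 0
    for body, is_spam in archive:
        if needle in body.lower():
            if is_spam:
                spam_hits += 1
            else:
                legit_hits += 1
    return spam_hits, legit_hits


def is_silver_bullet(phrase, archive):
    """A phrase qualifies only if it hits some spam and no legit mail at all."""
    spam_hits, legit_hits = evaluate_candidate(phrase, archive)
    return spam_hits > 0 and legit_hits == 0


# Hypothetical, made-up archive: (message body, is_spam)
archive = [
    ("This is an opt-in offer. To be removed, reply to this message.", True),
    ("Results guaranteed! 100% money back.", True),
    ("Here is the test I submitted from the class web form.", False),
    ("Is the money for the class trip due on Friday?", False),
]

for phrase in ["opt-in", "to be removed", "money"]:
    print(phrase, evaluate_candidate(phrase, archive), is_silver_bullet(phrase, archive))
```

In this toy archive, "money" on its own also matches a legit message, which is the kind of backfire the archive search is meant to catch before the phrase goes into a blocking filter.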

Pretty soon I started finding gems. “opt-in” was one, from those fake offers to remove you from their list, which are really a trap to identify people who read their emails. Others included “guaranteed!”, “100% money,” “to be removed,” “generic” (drug sales), and “enlargement” (you know what that’s about). These terms produced few enough legit email hits that I felt comfortable using them.

But I also started noticing the tricks that the spammers used to avoid detection. For example, “Viagra,” a very common spam topic, was being spelled in various ways, like v1agra, v_iagra, viagara, v.iagra, or v1a-gr@. I found that by filtering “v1a” and “agra ” (with a space at the end of the last one), most of them got filtered out. Some spam scattered intra-word periods all over the place, like “get doc.tor recom.mended v.iagra!”, thus breaking up the words so that a standard filter would not match them.
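Here is a rough sketch of how those two substrings behave against the spellings listed above. It assumes the host's filter does a plain case-insensitive substring match, which may not be exactly how any given control panel works; the function name and sample messages are made up for illustration.

```python
# The two filter strings from the post; note the trailing space in "agra ".
FILTER_STRINGS = ["v1a", "agra "]


def is_blocked(text, filter_strings=FILTER_STRINGS):
    """Return True if any filter string occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(s in lowered for s in filter_strings)


# Hypothetical sample lines built from the spellings mentioned above.
samples = [
    "cheap v1agra here",
    "buy v_iagra today",
    "get v.iagra now",
    "v1a-gr@ for less",
    "viagara on sale",
]

for sample in samples:
    print(sample, "->", is_blocked(sample))
```

Under that assumption, four of the five samples are caught and “viagara” gets through, since it contains neither string. A plain substring match would also catch a legit sentence such as “Photos from our trip to Agra are attached,” which is why each string is worth checking against the archive first.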

One rather devious word-breakup scheme I found used HTML, the web page markup language which is now commonly embedded in email to create various tables, backgrounds, text sizes, styles and colors, not to mention images (more on those below). I found one email that clearly read, “Free Tour,” which seemed like a possible filter. I searched for it, but nothing was found, not even the spam I’d seen it in. I couldn’t figure out why it was not showing up, until I turned off the “blah” filter which usually hides the HTML and other coding. Then I found this:

<!-- flow -->Cl<!-- gutenberg -->ick
he<!-- baby -->re for a F<!-- predominant -->R<!-- reagan -->EE to<!-- pint -->ur

See what they did? The <!-- ... --> construct is an HTML comment, a note programmers insert into code to mark something; it doesn’t actually do anything, and it doesn’t show up when the message is displayed normally. By scattering these comments all over the place, the spammers not only break up the words that would normally trigger a spam filter, but they also add words that might be what some people really search for. And because the comments do not appear in rendered form, the original message, in this case “Click here for a FREE tour,” comes across clearly to the reader.
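To make the trick concrete, here is a minimal sketch that strips HTML comments before searching, using Python's standard re module. It assumes the comments use the standard two-hyphen delimiters and are not nested; the function name is made up, and this is not something the host's filter or Eudora is known to do.

```python
import re

# Non-greedy match so each comment is removed separately; DOTALL lets a
# comment span line breaks.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(raw):
    """Remove HTML comments, then collapse runs of whitespace to single spaces."""
    without_comments = HTML_COMMENT.sub("", raw)
    return " ".join(without_comments.split())


raw = (
    "<!-- flow -->Cl<!-- gutenberg -->ick\n"
    "he<!-- baby -->re for a F<!-- predominant -->R<!-- reagan -->EE "
    "to<!-- pint -->ur"
)

print("free tour" in raw.lower())        # search on the raw source
print(strip_html_comments(raw))          # what the reader actually sees
print("free tour" in strip_html_comments(raw).lower())
```

The first search fails for the same reason the Eudora search did: in the raw source, the letters of “FREE tour” are never adjacent. Once the comments are removed, the phrase is searchable again.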

But many spammers use a more straightforward dodge: they don’t send any text. Instead, their ads arrive as images. Usually these are not images attached to the email itself, which would be bad enough. Instead, they use the HTML image link feature to tell your email program to go to their website and retrieve the image, then show it in your email when you view it. This not only evades the spam filters, but it also lets the spammers know who just opened their email, so they can send even more to you. Most people don’t know that you can and should turn off HTML graphics options in your email client.

Nevertheless, I was able to find an Achilles’ heel in this practice: they use the DIV tag, an HTML feature that allows alignment and other formatting control. Most graphics spammers use the tag to align their images. When I did an email search for “div align=” in Eudora, I hit the jackpot: half the spam email showed up, and NO regular email. Bingo!
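As a small illustration, the “div align=” check amounts to a case-insensitive search of the raw message source rather than the rendered text. The sample messages and the image URL below are invented for the example, and the function name is hypothetical.

```python
def has_div_align(raw_source):
    """Return True if the raw message source contains 'div align=' in any letter case."""
    return "div align=" in raw_source.lower()


# Hypothetical image-only spam: no readable text, just an aligned remote image.
image_spam = '<DIV ALIGN="center"><img src="http://example.invalid/ad.gif"></DIV>'

# Hypothetical plain-text legit message.
plain_mail = "Attached is the test I took on the class web page."

print(has_div_align(image_spam))
print(has_div_align(plain_mail))
```

The check only works if the filter sees the HTML source. Lowercasing first matters because tag and attribute names may arrive in any case.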

So I entered that and a few other choice terms into the spam filter, and the next day, the spam that Eudora had to filter out was cut by about 60%. Cool. So I’ll do some more searching, and maybe get it fine-tuned. With luck, the number of spam emails I have to glance at in my junk folder will be down to only a few a day, and it will be extremely easy to spot the legit emails. And when spam starts getting through in larger numbers again, I’ll just figure out the trick and add a filter to catch it.

I admit, this may very well be more trouble than it’s worth, but then again, it can be fun to try to outsmart the vermin.

  1. chris
    October 20th, 2003 at 11:13 | #1

    i dig your blog.
