September 24, 2003

Hiding email addresses from spammers' evil robots

Yes, you too can help fight spam. Here are some tips, in two sections. The first section applies to everyone who goes online. The second is for those who have websites (which also includes those of you who only have a page or two of family pictures, news, and contact info).

Basic unpleasant fact: spammers run automatic programs on their computers that surf across the web, searching for email addresses. These programs are commonly called robots, spiders, or bots. Not all of these programs are being used for nefarious purposes: indexing services like Google run them as well. Webmasters try to permit access to legitimate robots, while blocking spambots. Nonetheless, such blocking can't be counted on, and many sites don't even make the effort.

What to do? NEVER post your email address (or anyone else's, for that matter) online in robot-readable form. If you participate in an online discussion group, replace "@" with "--AT--" or something similar in any address you post. For good measure you can also replace ".com" or ".net" with "DOTcom" or "DOTnet". If this seems excessive, ask a webmaster: spambots really are everywhere.
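For anyone who would rather not retype addresses by hand, the substitution above can be sketched in a few lines of Python. This is only an illustration: the function name `obfuscate` and the address `someone@example.com` are made up for the example, and the stand-in strings are the ones suggested above.

```python
def obfuscate(address: str) -> str:
    """Rewrite user@host.tld as user--AT--hostDOTtld (hypothetical helper)."""
    local, at, domain = address.partition("@")
    if not at:
        raise ValueError("expected an address containing '@'")
    host, dot, tld = domain.rpartition(".")
    if not dot:
        # No dot in the domain part, so only the '@' is replaced.
        return f"{local}--AT--{domain}"
    return f"{local}--AT--{host}DOT{tld}"


print(obfuscate("someone@example.com"))  # someone--AT--exampleDOTcom
```

Only the last dot is replaced, so an address at a subdomain keeps its other dots; a more thorough version might replace them all.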

The above advice can also be applied to your webpages. An alternative is to replace some or all of the characters of the address with an image (GIF format is usually most efficient). At minimum, the giveaway "@" sign should be so replaced. Note that this may be a (minor) obstacle for those using browsers with image loading turned off; it can also be a bigger problem for the blind, and others relying upon browsers that convert text into sound.

A more elegant solution, one that also keeps the email address clickable, is to hide the address in plain (human) sight by means of JavaScript. So far (and likely for some time to come), bots just read raw HTML -- they do not run scripts. So you add a simple script that takes an address which is broken up in the HTML so as to be unrecognizable to a bot, and renders that address in human-readable form in the viewer's browser. An example of such a script is here; further elaboration is possible as well (on my sites I've substituted ASCII escape codes for letters and symbols, then further broken up the escape codes in the raw HTML).
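As a rough sketch of the idea (not the script linked above, and not necessarily how any particular site does it), the following Python helper generates a small HTML/JavaScript snippet in which the address appears only as broken-up `\uXXXX` escape codes. The function names, the chunk size, and the address `someone@example.com` are all assumptions made for illustration.

```python
def js_string_literal(text: str) -> str:
    """Return a JavaScript string literal with every character written as \\uXXXX.

    Assumes ordinary characters (code points up to U+FFFF).
    """
    return '"' + "".join(f"\\u{ord(c):04x}" for c in text) + '"'


def hidden_mailto_script(address: str, chunk: int = 3) -> str:
    """Build a <script> block that writes a clickable mailto link at view time."""
    # Break the address into short pieces so no long run of it sits in the HTML.
    pieces = [address[i:i + chunk] for i in range(0, len(address), chunk)]
    joined = " + ".join(js_string_literal(p) for p in pieces)
    return (
        '<script type="text/javascript">\n'
        f"var a = {joined};\n"
        f"var m = {js_string_literal('mailto:')};\n"
        # "<\/a>" avoids a literal "</" inside the inline script.
        "document.write('<a href=\"' + m + a + '\">' + a + '<\\/a>');\n"
        "</script>"
    )


snippet = hidden_mailto_script("someone@example.com")
print(snippet)
```

The output can be pasted into a page where the address should appear. Viewers with JavaScript turned off will see nothing, so it is worth adding a fallback such as an image or a "--AT--" style address.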

Note that there are sites which automatically generate an address-hiding script; you enter an address, and you get some lines of javascript code you can cut and paste into your site's HTML. Check these out carefully before you use them, however, as in many cases the addresses aren't visible in all browsers, even if javascript-enabled.

Posted by David on September 24, 2003 11:14 PM

Comments

If you own your own domain (like 'cronaca.com'), you can usually have wildcard e-mail addresses, so [anything]@cronaca.com will get to you -- if so, create throw-away addresses like, uh, cronaca@ritcey.com, for public posting. If they get too abused, just have a rule dump such mail into the bit bucket. It's also an interesting way to see who's selling your e-mail to whom (in the case of commercial websites that ask for an e-mail address).
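A minimal sketch of that bookkeeping, assuming a wildcard domain: one address per site, plus a set of "burned" addresses to discard. The names `throwaway_address` and `is_burned` and the domain `example.org` are invented for the example; a real setup would express the same rule in the mail server's or mail client's own filter language.

```python
import re


def throwaway_address(site_name: str, domain: str) -> str:
    """Derive a per-site address, e.g. 'Example Shop' -> example-shop@<domain>."""
    slug = re.sub(r"[^a-z0-9]+", "-", site_name.lower()).strip("-")
    if not slug:
        raise ValueError("site name has no usable characters")
    return f"{slug}@{domain}"


def is_burned(recipient: str, burned: set) -> bool:
    """True if mail to this recipient should go straight to the bit bucket."""
    return recipient.lower() in burned


address = throwaway_address("Example Shop", "example.org")
burned = {address}  # this address started receiving spam, so stop reading it
print(address, is_burned("Example-Shop@example.org", burned))
```

Because each address names the site it was given to, any spam arriving at it also shows where the address leaked from.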

Posted by: Ben on September 25, 2003 3:10 PM

Interestingly enough, the addresses I've been using for signing up at commercial websites have *not* ended up deluged with spam. Maybe I'm not much of a consumer, but I think the nature of spam has shifted considerably in recent years, with the worst and most deliberately hard-to-block spam now coming not from commercial mailing lists, but from addresses gleaned by spambots and sent out randomly through dictionary attacks.

Posted by: David on September 25, 2003 8:18 PM

Same here. The email address I use to post here and at other sites is a throwaway address that I check weekly. My home email address I give out to family only, sort of an MCI friends sort of thing. I have found that the throwaway address gets a lot of Nigerian scam/spam. And I guess blog readers have short/small members.
I have gone to some places where the email for a site was a jpg. You had to read it and type it by hand. The bots read type, not a photo of your address. If someone is too lazy to type your email address, they should not be online anyway.

Posted by: Gunner on September 25, 2003 8:21 PM

I use various addresses @herptilicus.com whenever I sign up for anything, and those addresses rarely generate spam.

That said, the methods suggested in the post are really just trying to fool spambots, and you can expect spambots to get smarter and smarter, and to parse obfuscated addresses.

The real solution is to use a server-side form (like I use on http://loxosceles.org/contact.shtml ), although this does require you to be able to run programs server-side (not possible on most free websites, for instance). Also, you have to make sure you're not using an insecure version (one FormMail script is a popular spam target) that could allow spammers to make it look like tons of spam came *from* you.

Posted by: beth on September 26, 2003 12:16 AM

Thanks for a lot of useful suggestions! Below I've summarized some of the methods I've encountered and given them a rating:

Inside Public key (PGP). Pro: Spiders won't look there for a good while. Con: You need PGP/GPG installed to extract it.

CGI form. Pro: Undetectable. Con: People need to use it the first time to get your real email address, and some people find that insulting.

Image of text. Pro: Practically undetectable. Con: Text-browsers and cut-n-paste are both hit.

Javascript. Pro: Practically undetectable. Con: Text-browsers and browsers with disabled JS. (They exist!)

HTML obfuscation (the @ thingy). Pro: All browsers. Con: Some spiders may detect them.

Text-wise obfuscation (Elaborated). These are far preferred on Usenet and other plaintext places. Basically, alter parts of the address in a way that not even a smart spider could undo to recover a valid email address. Pro: Foolproof if you are original. Con: Not foolproof if you use classical patterns. The strength lies in using illogical (to the computer) patterns.

org.eu.shine@simon

simon atttt shine.eu.org

simon@trolls.shine.eu.org.dragons (Remove all fairy creatures!)

Two important points:

1) Be original. The spiders are not intelligent. Their patterns do not yet evolve.

2) Render the hostname invalid. Some stupid spiders will harvest anything with an @ in it, so make life simpler for yourself or for the person who owns the domain name. Even spam that doesn't reach an inbox is still processed by the mail server unless the hostname is invalid.
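To illustrate both points, here is a sketch of what a simple-minded harvester might do with the three example addresses above. The regular expression is an assumption about how a naive spider could work, not a description of any real spambot.

```python
import re

# A deliberately naive pattern: something@something.letters
NAIVE_ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(text: str) -> list:
    """Return everything in the text that looks like an email address."""
    return NAIVE_ADDRESS.findall(text)


samples = [
    "org.eu.shine@simon",
    "simon atttt shine.eu.org",
    "simon@trolls.shine.eu.org.dragons (Remove all fairy creatures!)",
]
for sample in samples:
    print(repr(sample), "->", harvest(sample))
```

The first two samples yield nothing. The third is harvested, but only in its altered form with "trolls" and "dragons" still in it, which is why it matters that the altered hostname does not resolve to anyone's real mail server.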

Posted by: Simon Shine on June 11, 2004 8:13 PM

I think it is possible to program a spider so that it identifies 'AT' or 'at' and puts '@' at the correct place in a person's email address. In a similar fashion, I do think that it is possible to put '.' for 'DOT' or 'dot' and 'com' for 'COM', and so get the email address easily.
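A sketch of such a spider's clean-up step, to show how little code the common patterns need. The function name and the list of spellings handled are assumptions; a real harvester could use a longer list.

```python
import re


def deobfuscate(text: str) -> str:
    """Undo a few common spellings of '@' and '.' (illustrative only)."""
    text = re.sub(r"\s*(?:--AT--|\(at\)|\[at\]|\bat\b)\s*", "@", text,
                  flags=re.IGNORECASE)
    text = re.sub(r"\s*(?:\(dot\)|\[dot\]|\bdot\b)\s*", ".", text,
                  flags=re.IGNORECASE)
    return text


print(deobfuscate("someone AT example DOT com"))  # someone@example.com
print(deobfuscate("simon atttt shine.eu.org"))    # unchanged
```

The second call leaves the text alone, because "atttt" is not one of the spellings the spider was told about. Run on ordinary prose this would also mangle phrases such as "meet at noon", so a spider would need to check its results afterwards.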

My suggestions for hiding from web spiders:
I think we have to use images in place of the email address. The address itself should be encrypted, so that it is hidden from the spider's eyes, and the actual email address should be displayed only when we move the cursor over the image.

We still have to sharpen our tricks against spiders.

Cheeeeeeeeeeeers,
karthik bala guru

Posted by: karthik bala guru on October 8, 2004 3:21 AM