A while back, I posted my method for defeating spambots that harvest email addresses. This post is an update to that original method. It explores cleaner, less obtrusive code approaches and more accessible/usable HTML markup.
The other “solutions”
So how do you prevent spambots from harvesting your email address? Well, there are a gazillion suggestions out on the interwebs, and unfortunately most of them stink because they require JavaScript, and because they often use illegible or invalid markup. For instance, this example — which was created by an email address obfuscator ranked high in Google searches — uses character entities to render the text completely illegible:
<a href="mailto:this@stinks.com"
This method has been popular for a number of years, but has some serious flaws. First of all, how do you know if you even have the right address in there? Secondly, what’s to stop a spambot from reading character entities? I imagine it would be as easy as reading ASCII or UTF. GONG!
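To show how little work that would be, here is a minimal sketch of the decoding step a harvester could run. The name decodeEntities is made up for illustration, and the sketch only handles numeric entities; I'm assuming a real bot would cover named entities as well.

```javascript
// Hypothetical helper: decode numeric character entities such as &#116; or &#x74;
function decodeEntities(text) {
  return text
    // Hex form, e.g. &#x74;
    .replace(/&#x([0-9a-f]+);/gi, function (match, hex) {
      return String.fromCodePoint(parseInt(hex, 16));
    })
    // Decimal form, e.g. &#116;
    .replace(/&#(\d+);/g, function (match, dec) {
      return String.fromCodePoint(parseInt(dec, 10));
    });
}

var obfuscated = "&#116;&#104;&#105;&#115;&#64;&#115;&#116;&#105;&#110;&#107;&#115;&#46;&#99;&#111;&#109;";
console.log(decodeEntities(obfuscated)); // logs "this@stinks.com"
```

Two regular expressions are all it takes, which is why I don't consider entity encoding much of a defense.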
Here’s another popular approach, premised on the notion that spambots look for any links using a mailto: protocol:
<script type="text/javascript">
function emailme(user, domain, suffix){
var str = 'mai' + 'lto:' + user + '@' + domain + '.' + suffix;
window.location.replace(str);
}
</script>
<a href="javascript:emailme('this','stinks','com')">this@stinks.com</a>
There are multiple problems with this approach. The first problem is that it doesn’t use mailto: in the markup. This means if JavaScript is disabled, the link is completely useless. It also breaks the semantics of the links.
The second problem is that the JavaScript is inline and therefore obtrusive. JavaScript should not be mingling with your markup… it’s bad form! Any link that starts with javascript: is troublesome in my book.
Lastly, the whole address is still contained in the text of the page. If a spambot is sophisticated enough to look for mailto: protocols, it’s probably sophisticated enough to use RegEx to search for text that uses both @ and a period (.) without spaces.
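As a rough illustration of that last point, here is the kind of pattern match I have in mind. The pattern and the findAddresses name are my own assumptions, not code taken from any real spambot.

```javascript
// Rough email-shaped pattern; a real harvester may use something more elaborate
var EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Hypothetical helper: return every email-looking string found in a chunk of page source
function findAddresses(pageSource) {
  return pageSource.match(EMAIL_PATTERN) || [];
}

var markup = "<a href=\"javascript:emailme('this','stinks','com')\">this@stinks.com</a>";
console.log(findAddresses(markup)); // logs [ 'this@stinks.com' ]
```

The href is hidden behind a function call, but the visible link text gives the address away anyway.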
There are other solutions out there, too, but they all require invalid markup, semantically incorrect markup, or flat-out removal of the email hyperlink. I want a solution that remains clickable when JavaScript is disabled, and doesn’t get all screwy with the markup. These don’t fit the bill. There’s another way.
A cleaner solution
My solution is simple: use an invalid email address. No, really! An invalid address with some extra touches and some unobtrusive JavaScript will work wonders. Here’s how to use it, step by step:
Step one: Create your markup using a slightly altered address
Begin with a real address, then modify it to include some dummy text. For instance, the address sales@visitwaikiki.com would be rewritten salesnotspam@visitwaikiki.com. The spambot will harvest the address salesnotspam@visitwaikiki.com, which won’t work when the spammers try to use it.
The markup should look like this:
<a href="mailto:salesnotspam@visitwaikiki.com">sales@visitwaikiki.com</a>
There’s an obvious flaw here: The email address is still written in plain text between the ‘a’ tags. We’ll need to use alternate text — if you want to avoid spambots, NEVER use the real address as the visible text in an email hyperlink.
Using something such as sales AT visitwaikiki DOT com is also probably a bad idea, simply because zealous spambot authors can look for that very common pattern and manage to parse the email address. You’re best off using a different phrase, such as:
<a href="mailto:janenotspam@visitwaikiki.com">Contact Jane.</a>
<a href="mailto:salesnotspam@visitwaikiki.com">Email our sales department.</a>
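For the curious, here is a small sketch of why the AT/DOT disguise is so weak. The undoAtDot name is hypothetical, and I'm assuming the harvester only looks for the upper-case variants.

```javascript
// Hypothetical helper: undo the common "AT"/"DOT" disguise
function undoAtDot(text) {
  return text
    .replace(/\s+AT\s+/g, "@")
    .replace(/\s+DOT\s+/g, ".");
}

console.log(undoAtDot("sales AT visitwaikiki DOT com")); // logs "sales@visitwaikiki.com"
```

A descriptive phrase such as "Contact Jane." gives a pattern matcher nothing comparable to work with.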
We still have another problem to address: The link works, but it’s using the wrong address! The next step will help with that.
Step two: Improve the markup to make the link more usable when JavaScript is disabled
It’s always a good idea to ensure your visitor can use the email hyperlink when JavaScript is disabled. As it stands, when the visitor clicks the link, their operating system will create an email addressed to the invalid address salesnotspam@visitwaikiki.com. Without JavaScript, we can’t correct the address, but we can let the user know that the address needs to be edited.
<a href="mailto:salesnotspam@visitwaikiki.com?subject=EMAIL ADDRESS NEEDS EDITING&body=Please remove the text 'notspam' from the address before sending your email.">
Email our sales department.
</a>
The mailto: protocol allows users to tack on additional information using the subject and body options. Whatever is listed after subject will appear in the email’s subject line. Whatever is listed after body will appear in the message’s body. By creatively using these options in the email address, we can clearly instruct the visitor to edit the address as needed. The code above this paragraph produces the following email when clicked:
To: salesnotspam@visitwaikiki.com
Subject: EMAIL ADDRESS NEEDS EDITING
Message: Please remove the text ‘notspam’ from the address before sending your email.
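If you generate these links with a script rather than typing them by hand, a small helper can assemble the URL. This is only a sketch: buildMailto is a name I made up, and I'm assuming you want the subject and body percent-encoded, which is the safer form even though browsers generally tolerate raw spaces in the href.

```javascript
// Hypothetical helper: assemble a mailto: URL with percent-encoded subject and body
function buildMailto(address, subject, body) {
  return "mailto:" + address +
    "?subject=" + encodeURIComponent(subject) +
    "&body=" + encodeURIComponent(body);
}

var href = buildMailto(
  "salesnotspam@visitwaikiki.com",
  "EMAIL ADDRESS NEEDS EDITING",
  "Please remove the text 'notspam' from the address before sending your email."
);
console.log(href);
```

When the result is pasted into an HTML attribute, the & between the two options should be written as &amp; if you care about strict validation.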
Is it a pain to have to include the subject and/or body options each time you write an address? Yes. But is it more of a pain than the hundreds of spam emails you might get each week? I doubt it.
We now have a fully-functioning standards-friendly markup-only spam-resistant link. (Yes, I love hyphens. Don’t you?) Next, we’ll improve the experience for the 95% or so of your visitors who have JavaScript enabled.
Step three: Use JavaScript to make the link behave normally for most visitors
Most of your visitors will have JavaScript enabled; let’s take advantage of this and improve their experience. Our primary goal with our script will be to correct the invalid address by removing the dummy text “notspam”. However, since we’re removing the dummy text, we’ll also need to remove the instructions contained in the subject and body options so we don’t confuse the visitor.
Here’s a simple function that scans the page for all email links, then removes the dummy text (assuming all links use the same dummy text), the subject option, and the body option:
onload approach
window.onload = function (){
    var links = document.getElementsByTagName("a");
    for (var i=0; i < links.length; i++){
        //Only touch email links
        if(links[i].href.indexOf("mailto:") !== -1){
            //Drop the subject/body options, then strip the dummy text
            links[i].href = links[i].href.split("?")[0].replace("notspam", "");
        }
    }
};
This teeny bit of JavaScript executes when the page loads and makes all email links behave as expected. Now we have a fully-functioning standards-friendly spam-resistant email link that also degrades nicely for visitors without JavaScript, and looks/feels completely normal to everyone else.
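To make the string handling easier to follow, here is the cleaning step pulled out on its own. The cleanHref name and the DUMMY_TEXT variable are mine, added purely for illustration.

```javascript
var DUMMY_TEXT = "notspam";

// Hypothetical helper: keep everything before "?", then strip the dummy text
function cleanHref(href) {
  return href.split("?")[0].replace(DUMMY_TEXT, "");
}

var before = "mailto:salesnotspam@visitwaikiki.com?subject=EMAIL%20ADDRESS%20NEEDS%20EDITING&body=...";
console.log(cleanHref(before)); // logs "mailto:sales@visitwaikiki.com"
```

split("?")[0] throws away the subject and body options in one go, and replace then removes the first occurrence of the dummy text.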
However, if you’re paranoid like me, you’ll wonder: What if the spambot supports JavaScript and looks for email addresses after the page has loaded? Your email address would be just as vulnerable as it was before.
A quick tweak to the script can help: instead of cleaning the addresses when the page loads, we can choose to only clean an address when the link is clicked.
onclick approach
window.onload = function (){
    var addressCleaner = function (){
        //Drop the subject/body options, then strip the dummy text
        this.href = this.href.split("?")[0].replace("notspam", "");
        //The address is clean now, so the handlers have nothing left to do
        this.onclick = function (){};
        this.oncontextmenu = function (){};
    };
    var links = document.getElementsByTagName("a");
    for (var i=0; i < links.length; i++){
        if(links[i].href.indexOf("mailto:") !== -1){
            links[i].onclick = addressCleaner;
            links[i].oncontextmenu = addressCleaner;
        }
    }
};
Note: all modern browsers treat a link as ‘clicked’ if you tab to it and hit enter on your keyboard, which means the link remains accessible to those using keyboard navigation and/or screen readers.
Also, notice the oncontextmenu code; when a link is right-clicked, the onclick event isn’t triggered. If a person right-clicks the email address to copy it, they would be copying the invalid version of the address. Using the oncontextmenu event fixes this problem.
You’re done!
You now have a spam-resistant email hyperlink that works whether JavaScript is enabled or not. It adheres to standards (no invalid markup), is semantically correct, and is unobtrusive.
Having said that, you should be aware that this system is not perfect; spammers are very clever, and will always catch up to us. This method is a form of spam resistance, not a foolproof way to defeat all spambots from now until eternity.
While the code you’ve just seen will work fine for most people, there are a few improvements that can be made with the use of a JavaScript framework. If you don’t use a JavaScript framework such as MooTools or jQuery, your journey has ended. If you do use a framework, let’s explore some potential improvements to the system.
Improvements via frameworks
JavaScript frameworks add some impressive tools to our toolbox and provide many conveniences. For this example, I’m going to use MooTools 1.2, but most other frameworks will have similar code that you can adapt for your own needs. Here are some improvements we can make:
- Use event handlers instead of direct assignment.
- Use a domready event instead of window.onload.
- Use CSS selectors and the array:each method.
Here’s the improved code, modified to use MooTools 1.2:
window.addEvent("domready", function(){
    var addressCleaner = function (){
        this.href = this.href.split("?")[0].replace("notspam", "");
        this.removeEvents({
            "click": addressCleaner,
            "contextmenu": addressCleaner
        });
    };
    $$("a[href^=mailto:]").each(function (a){
        a.addEvents({
            "click": addressCleaner,
            "contextmenu": addressCleaner
        });
    });
});
Explanation of the MooTools framework version
Since some of you may not be familiar with frameworks, I’ll try to explain the changes I’ve made.
Event handlers
Most JavaScript gurus will tell you that using event handlers is a much more robust approach than using a direct onclick assignment. For starters, adding an onclick event using direct assignment will overwrite any existing onclick event. Using an event handler will ensure the new event will not destroy any existing events, and will simply add the new event to a queue of events.
//Direct assignment
a.onclick = function (){
//Do something
};
//MooTools event
a.addEvent("click", function (){
//Do something
});
As you can imagine, if you don’t use a framework, browser support and cross-browser incompatibility issues make event handlers a bit of a pain. This is one of the primary reasons frameworks have become so popular: they take the pain out of cross-browser compatibility.
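For comparison, here is a sketch of the same idea without a framework, using the standard addEventListener and removeEventListener methods. It assumes a browser that supports them (Internet Explorer 8 and earlier used attachEvent instead), and the attachCleaners name is hypothetical.

```javascript
// Hypothetical helper: attach a one-time address cleaner to each mailto link
function attachCleaners(links, dummyText) {
  Array.prototype.forEach.call(links, function (link) {
    //Skip anything that is not an email link
    if (link.href.indexOf("mailto:") !== 0) { return; }
    var clean = function () {
      link.href = link.href.split("?")[0].replace(dummyText, "");
      //The address is fixed, so both listeners can go
      link.removeEventListener("click", clean);
      link.removeEventListener("contextmenu", clean);
    };
    link.addEventListener("click", clean);
    link.addEventListener("contextmenu", clean);
  });
}

//Only wire things up when a DOM is available (i.e. in a browser)
if (typeof document !== "undefined") {
  attachCleaners(document.querySelectorAll("a[href^='mailto:']"), "notspam");
}
```

Because listeners are added rather than assigned, any other click handlers already on the link are left alone.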
Change window.onload to a domready event
The domready event is executed earlier than an onload event. domready basically means that all markup has loaded into the browser DOM, even if images and other media haven’t finished downloading yet. onload, by comparison, only fires after everything has finished loading. A MooTools domready event looks like this:
window.addEvent("domready", function (){
//do something
});
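If you're not using a framework, the closest native equivalent I know of is the DOMContentLoaded event. The sketch below assumes a browser that supports it; onDomReady is a made-up name, and the readyState check covers the case where the script runs after the markup has already been parsed.

```javascript
// Hypothetical helper: run callback once the markup has been parsed
function onDomReady(doc, callback) {
  if (doc.readyState === "loading") {
    //Still parsing: wait for the event
    doc.addEventListener("DOMContentLoaded", callback);
  } else {
    //Markup is already available: run right away
    callback();
  }
}

//Only run when a DOM is available (i.e. in a browser)
if (typeof document !== "undefined") {
  onDomReady(document, function () {
    //do something
  });
}
```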
Use CSS selectors and the array:each method
MooTools allows us to replace document.getElementsByTagName with a much more targeted CSS-based selector: $$("a[href^=mailto:]"). This selector finds all links on the page whose href attribute begins with mailto:, then places the results in a new array. This means we can ditch two elements of our original script: the call to
document.getElementsByTagName("a")
and the if statement inside the loop:
if(links[i].href.indexOf("mailto:") !== -1)
Next, we can replace the for loop with an each method, which performs whatever action is specified on each of the items in the array.
myArray.each(function (arrayitem){
//do something with arrayitem
});
The each method mirrors the native forEach array method, which is available in browsers not named Internet Explorer. Frameworks like MooTools and jQuery bring support for this kind of iteration to browsers that don’t natively support it.
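For reference, here is a small sketch of the native version of this loop. It assumes a browser (or other JavaScript environment) that supports Array.prototype.forEach, and collectMailtoHrefs is a hypothetical name.

```javascript
// Hypothetical helper: collect the href of every mailto link in an array-like list
function collectMailtoHrefs(links) {
  var found = [];
  //forEach.call lets us loop over array-like objects such as a NodeList
  Array.prototype.forEach.call(links, function (link) {
    if (link.href.indexOf("mailto:") === 0) {
      found.push(link.href);
    }
  });
  return found;
}

//Plain objects stand in for real links so the sketch runs outside a browser
var sampleLinks = [
  { href: "mailto:salesnotspam@visitwaikiki.com" },
  { href: "http://example.com/" }
];
console.log(collectMailtoHrefs(sampleLinks)); // logs [ 'mailto:salesnotspam@visitwaikiki.com' ]
```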
Now that we’ve got our CSS-based selector working with the each method, we can greatly simplify our code:
window.addEvent("domready", function(){
    var addressCleaner = function (){
        this.href = this.href.split("?")[0].replace("notspam", "");
        this.removeEvents({
            "click": addressCleaner,
            "contextmenu": addressCleaner
        });
    };
    $$("a[href^=mailto:]").each(function (a){
        a.addEvents({
            "click": addressCleaner,
            "contextmenu": addressCleaner
        });
    });
});
Tips
- You can place the dummy text in any part of your email address, not just the username portion. For instance, you could do sales@visitSOMEWHEREOTHERTHANwaikiki.com, sales@visitwaikiki.commie, etc.
- It’s probably a good idea to use dummy text other than the common phrase “nospam”; authors of spambot software can easily look for these phrases as keywords and use them to target your address. Get creative with your dummy text, just be sure it’s obvious to a human reader that the text needs to be removed.
- If you have multiple email addresses on the page, this method requires that you use the same dummy text in all email addresses.
- Be sure you change the dummy text in the JavaScript function to match whatever text you decide to use!
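One way to keep the dummy text in a single place is to pass it in as a parameter. This is just a sketch of that idea; makeCleaner is a hypothetical name, not part of the scripts above.

```javascript
// Hypothetical helper: build a cleaning function for whatever dummy text you chose
function makeCleaner(dummyText) {
  return function (href) {
    //Drop the subject/body options, then strip the first occurrence of the dummy text
    return href.split("?")[0].replace(dummyText, "");
  };
}

var cleanDomain = makeCleaner("SOMEWHEREOTHERTHAN");
console.log(cleanDomain("mailto:sales@visitSOMEWHEREOTHERTHANwaikiki.com?subject=x"));
// logs "mailto:sales@visitwaikiki.com"
```

Note that replace with a plain string only removes the first match, so pick dummy text that doesn't also appear earlier in the real address.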
Known Issues
When JavaScript is disabled and someone copies/pastes the email address instead of clicking it, they will be copying the invalid version of the address. To minimize problems, you can write the address in a hard-to-miss way, such as using all caps for the dummy text (salesNOTSPAM@visitwaikiki.com). This will affect an extremely small percentage of users, so I wouldn’t worry too much; if they’re savvy enough to disable JavaScript and use copy/paste for email addresses, they’ll probably read the address, too.
This email address obfuscation method has been successfully tested in the following browser/OS combinations:
- Firefox 3.0 (Mac OS X, Windows Vista)
- Safari 3.2.1 (Mac OS X, Windows Vista)
- Internet Explorer 6 (Windows XP)
- Internet Explorer 7 (Windows Vista)
- Internet Explorer 8b1 (Windows 7 beta)
- Opera 9.6 (Mac OS X, Windows Vista)
- One issue in Opera: The contextmenu event doesn’t trigger correctly when right-clicking
Well done!
Nice to see that you followed the same logical approach I did in year 2004 for email obfuscation.
We had to organise a big public event with online registration of participants. We couldn’t afford any spam on these email addresses.
Your concern to “make the link more usable when JavaScript is disabled” was solved in a similar way. The difference is that the user was suggested to activate JS via a fake href=”mailto:Activ@te.JavaScript.in.Browser.then.re-click.link”.
Strangely enough all the browsers accepted it as a “good email address”, so managing to make the alert.
I used onmouseover instead of onclick, and also added an onfocus=”this.onmouseover()” for preserving tabbed link navigation through the page. At that time oncontextmenu was not invented yet.
About “NEVER use the real address…”: you can see we used a mixed approach. A fairly nice-looking email address, but with nuisances immediately suggesting the user to NOT dumb-copy-paste that text (and why should he do, since it’s much simpler to click or to do a contextmenu copy action?)
A page with such tricks still sits at:
http://web.archive.org/web/20041207040024/www.irc-irene.org/ecomondo2004/
Hello, thank you for this very helpful post. I use jQuery rather than MooTools and I ported your script over to jQuery so I thought I’d post it in case it’s of use to anyone else. I also added handlers for the ‘focus’ and ‘mouseover’ events so that the address appears correct in the status bar if users hover the mouse over the link or tab to it.
@jon thanks for sharing!