A while back, I posted my method for defeating spambots that harvest email addresses. This post is an update to that original method. It explores cleaner, less obtrusive code approaches and more accessible/usable HTML markup.
If you’re impatient and want to jump to some working examples, here you go:
The other “solutions”
<a href="mailto:email@example.com">email@example.com</a>
This method has been popular for a number of years, but has some serious flaws. First of all, how do you know if you even have the right address in there? Secondly, what’s to stop a spambot from reading character entities? I imagine it would be as easy as reading ASCII or UTF. GONG!
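To show how little protection character entities offer, here is a rough sketch of the decoding step a harvester could run. The function name `decodeNumericEntities` is made up for this example, and it only handles numeric entities (decimal and hex), not named ones.

```javascript
// Hypothetical helper: turns numeric character entities back into plain text.
// Only the &#NNN; and &#xHHH; forms are handled; named entities are left alone.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, function (match, code) {
    var isHex = code.charAt(0).toLowerCase() === "x";
    var codePoint = isHex ? parseInt(code.slice(1), 16) : parseInt(code, 10);
    return String.fromCharCode(codePoint);
  });
}

// "mailto:" written as decimal entities, the way an encoder tool might emit it.
var encoded = "&#109;&#97;&#105;&#108;&#116;&#111;&#58;";
console.log(decodeNumericEntities(encoded)); // mailto:
```

A single regular expression and a callback are all it takes, which is why entity encoding should not be counted on as a defense.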
Here’s another popular approach, premised on the notion that spambots look for any links using a mailto: protocol.
There are multiple problems with this approach. The first problem is that it doesn’t use
Lastly, the whole address is still contained in the text of the page. If a spambot is sophisticated enough to look for mailto: protocols, it’s probably sophisticated enough to use RegEx to search for text that uses both @ and a period (.) without spaces.
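As a rough illustration of that last point, a pattern along these lines would pull address-like strings out of page text. The regular expression is a deliberately loose assumption for this sketch, not a full address grammar, and `findAddressLikeText` is a made-up name.

```javascript
// Hypothetical harvester step: find text containing an @ and a period, no spaces.
function findAddressLikeText(pageText) {
  // Non-space characters, an @, then more non-space characters with a period inside.
  var pattern = /[^\s@<>"']+@[^\s@<>"']+\.[^\s@<>"']+/g;
  return pageText.match(pattern) || [];
}

var sample = "Write to email@example.com today";
console.log(findAddressLikeText(sample)); // logs an array holding "email@example.com"
```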
A cleaner solution
Step one: Create your markup using a slightly altered address
Begin with a real address, then modify it to include some dummy text. For instance, the address firstname.lastname@example.org would be rewritten email@example.com. The spambot will harvest the altered address, which won’t work when the spammers try to use it.
The markup should look like this:
<a href="mailto:email@example.com">firstname.lastname@example.org</a>
There’s an obvious flaw here: The email address is still written in plain text between the ‘a’ tags. We’ll need to use alternate text — if you want to avoid spambots, NEVER use the real address as the visible text in an email hyperlink.
Using something such as sales AT visitwaikiki DOT com is also probably a bad idea, simply because zealous spambot authors can look for that very common pattern and manage to parse the email address. You’re best off using a different phrase, such as:
<a href="mailto:email@example.com">Contact Jane.</a>
<a href="mailto:firstname.lastname@example.org">Email our sales department.</a>
We still have another problem to address: The link works, but it’s using the wrong address! The next step will help with that.
<a href="mailto:email@example.com?subject=EMAIL ADDRESS NEEDS EDITING&body=Please remove the text 'notspam' from the address before sending your email."> Email our sales department. </a>
The mailto: protocol allows users to tack on additional information using the subject and body options. Whatever is listed after subject will appear in the email’s subject line. Whatever is listed after body will appear in the message’s body. By creatively using these options in the email address, we can clearly instruct the visitor to edit the address as needed. The code above this paragraph produces the following email when clicked:
Subject: EMAIL ADDRESS NEEDS EDITING
Message: Please remove the text ‘notspam’ from the address before sending your email.
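If typing these links by hand gets tedious, the href can be assembled with a small helper. This is only a sketch: `buildMailtoHref` is a hypothetical name, and percent-encoding the subject and body with `encodeURIComponent` is an addition here (the hand-written link above leaves its spaces unencoded).

```javascript
// Hypothetical helper: builds a mailto: href with subject and body options.
function buildMailtoHref(address, subject, body) {
  return "mailto:" + address +
    "?subject=" + encodeURIComponent(subject) +
    "&body=" + encodeURIComponent(body);
}

var href = buildMailtoHref(
  "email@example.com",
  "EMAIL ADDRESS NEEDS EDITING",
  "Please remove the text 'notspam' from the address before sending your email."
);
console.log(href);
```

Encoding the text keeps spaces and punctuation from being misread as part of the link syntax; the mail program decodes them again before showing the message.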
Is it a pain to have to include the subject and/or body options each time you write an address? Yes. But is it more of a pain than the hundreds of spam emails you might get each week? I doubt it.
Here’s a simple function that scans the page for all email links, then removes the dummy text (assuming all links use the same dummy text), the subject option, and the body option:
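A minimal sketch of such a function, assuming the dummy text is `notspam`. The names `cleanMailtoHref` and `cleanAllMailtoLinks` are illustrative, and this is not necessarily the original listing.

```javascript
// Assumption: every address on the page uses this same dummy text.
var DUMMY_TEXT = "notspam";

// Removes the subject/body options and the dummy text from one mailto: href.
function cleanMailtoHref(href, dummyText) {
  var withoutOptions = href.split("?")[0];
  return withoutOptions.replace(dummyText, "");
}

// Scans every link in the given document and cleans the ones that use mailto:.
function cleanAllMailtoLinks(doc, dummyText) {
  var links = doc.getElementsByTagName("a");
  for (var i = 0; i < links.length; i++) {
    if (links[i].href.indexOf("mailto:") === 0) {
      links[i].href = cleanMailtoHref(links[i].href, dummyText);
    }
  }
}

// Only wire this up when running inside a browser page.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.onload = function () {
    cleanAllMailtoLinks(document, DUMMY_TEXT);
  };
}
```

Visitors with JavaScript enabled get a working link, while anyone without it still sees the subject and body instructions.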
A quick tweak to the script can help: instead of cleaning the addresses when the page loads, we can choose to only clean an address when the link is clicked.
Note: all modern browsers treat a link as ‘clicked’ if you tab to it and hit enter on your keyboard, which means the link remains accessible to those using keyboard navigation and/or screen readers.
Also, notice the oncontextmenu code; when a link is right-clicked, the onclick event isn’t triggered. If a person right-clicks the email address to copy it, they would be copying the invalid version of the address. Using the oncontextmenu event fixes this problem.
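A sketch of the click-time variant, assuming the dummy text is `notspam`. The function names `stripDummyText` and `cleanOnInteraction` are made up for this example and it may differ from the original listing.

```javascript
var DUMMY_TEXT = "notspam"; // assumption: shared by every address on the page

// Strips the subject/body options and the dummy text from a mailto: href.
function stripDummyText(href, dummyText) {
  return href.split("?")[0].replace(dummyText, "");
}

// Cleans a single link only when the visitor activates or right-clicks it.
function cleanOnInteraction(link, dummyText) {
  var clean = function () {
    link.href = stripDummyText(link.href, dummyText);
  };
  link.onclick = clean;       // left-click, or Enter on a focused link
  link.oncontextmenu = clean; // right-click, for example to copy the address
}

// Only wire this up when running inside a browser page.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.onload = function () {
    var links = document.getElementsByTagName("a");
    for (var i = 0; i < links.length; i++) {
      if (links[i].href.indexOf("mailto:") === 0) {
        cleanOnInteraction(links[i], DUMMY_TEXT);
      }
    }
  };
}
```

Until a visitor interacts with a link, the markup in the page keeps the altered address, so a harvester that merely loads the page has nothing clean to read.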
Having said that, you should be aware that this system is not perfect; spammers are very clever, and will always catch up to us. This method is a form of spam resistance, not a foolproof way to defeat all spambots from now until eternity.
Improvements via frameworks
- Use event handlers instead of direct assignment.
- Use a domready event instead of window.onload.
- Use CSS selectors and the each method.
Here’s the improved code, modified to use MooTools 1.2:
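A sketch of what the MooTools 1.2 version could look like, using `window.addEvent`, `$$`, `each`, and `addEvents` from that framework. The dummy text `notspam` and the helper name `removeDummyText` are assumptions for this example, and this is not necessarily the author's exact listing.

```javascript
var DUMMY_TEXT = "notspam"; // assumption: the same dummy text in every address

// Strips the subject/body options and the dummy text from a mailto: href.
function removeDummyText(href, dummyText) {
  return href.split("?")[0].replace(dummyText, "");
}

// Runs only where MooTools is loaded and has added addEvent to window.
if (typeof window !== "undefined" && typeof window.addEvent === "function") {
  window.addEvent("domready", function () {
    // Select every link whose href begins with mailto:
    $$("a[href^=mailto:]").each(function (link) {
      var clean = function () {
        link.href = removeDummyText(link.href, DUMMY_TEXT);
      };
      // Event handlers are added alongside any existing ones, not over them.
      link.addEvents({ click: clean, contextmenu: clean });
    });
  });
}
```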
Explanation of the MooTools framework version
Since some of you may not be familiar with frameworks, I’ll try to explain the changes I’ve made.
Use event handlers instead of direct onclick assignment. For starters, adding an onclick event using direct assignment will overwrite any existing onclick event. Using an event handler will ensure the new event will not destroy any existing events, and will simply add the new event to a queue of events.
As you can imagine, if you don’t use a framework, browser support and cross-browser incompatibility issues make event handlers a bit of a pain. This is one of the primary reasons frameworks have become so popular: they take the pain out of cross-browser compatibility.
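The difference can be seen in a few lines of framework-free code. This sketch uses the standards-based `addEventListener`, with a plain object and a bare `EventTarget` standing in for a link element (both are stand-ins chosen so the example runs outside a page too).

```javascript
var log = [];

// Direct assignment: the property holds one function, so the second one wins.
var link = { onclick: null }; // plain object standing in for a link element
link.onclick = function () { log.push("first (direct)"); };
link.onclick = function () { log.push("second (direct)"); };
link.onclick();

// Event handlers: each listener joins a queue and all of them run.
var target = new EventTarget(); // also a stand-in for a link element
target.addEventListener("click", function () { log.push("first (listener)"); });
target.addEventListener("click", function () { log.push("second (listener)"); });
target.dispatchEvent(new Event("click"));

console.log(log); // three entries: one direct call, two listener calls
```

Only "second (direct)" is recorded for the direct assignments, while both listeners fire in the order they were added.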
Change window.onload to a domready event. A domready event is executed earlier than an onload event. domready basically means that all markup has loaded into the browser DOM, even if images and other media haven’t finished downloading yet. onload, by comparison, only fires after everything has finished loading. A MooTools domready event looks like this:
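A minimal sketch of the registration, assuming MooTools has added `addEvent` to `window`. The wrapper name `runWhenDomReady` is hypothetical and exists only to show the call in isolation.

```javascript
// Hypothetical wrapper around the MooTools call.
function runWhenDomReady(win, callback) {
  // "domready" is the MooTools custom event name.
  win.addEvent("domready", callback);
}

// Runs only where MooTools is loaded and has added addEvent to window.
if (typeof window !== "undefined" && typeof window.addEvent === "function") {
  runWhenDomReady(window, function () {
    // Markup is in the DOM at this point; images may still be downloading.
  });
}
```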
Use CSS selectors and the each method. MooTools allows us to replace document.getElementsByTagName with a much more targeted CSS-based selector: $$("a[href^=mailto:]"). This selector finds all links on the page whose href attribute begins with mailto:, then places the results in a new array. This means we can ditch two elements of our original script: the call to document.getElementsByTagName and the if syntax inside the loop.
Next, we can replace the for loop with an each method, which performs whatever action is specified on each of the items in the array.
The each array method (forEach in standard JavaScript) is native to browsers not named Internet Explorer. Frameworks like MooTools and jQuery bring support for this function to browsers that don’t natively support it.
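To make the comparison concrete outside of MooTools, here is a framework-free sketch using the native `forEach`, with `filter` standing in for the `$$` selector. The sample data is invented for illustration.

```javascript
// Plain objects stand in for link elements so the sketch runs anywhere.
var links = [
  { href: "mailto:email@example.com" },
  { href: "http://example.com/" }
];

// Original style: a for loop with an if test inside it.
var viaLoop = [];
for (var i = 0; i < links.length; i++) {
  if (links[i].href.indexOf("mailto:") === 0) {
    viaLoop.push(links[i].href);
  }
}

// each style: select the matching items first, then act on every one of them.
var viaEach = [];
links.filter(function (link) {
  return link.href.indexOf("mailto:") === 0;
}).forEach(function (link) {
  viaEach.push(link.href);
});

console.log(viaLoop, viaEach); // both hold only the mailto: href
```

Both versions collect the same single href; the second simply moves the test out of the loop body and into the selection step.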
Now that we’ve got our CSS-based selector working with the each method, we can greatly simplify our code:
- You can place the dummy text in any part of your email address, not just the username portion. For instance, you could do
- It’s probably a good idea to use dummy text other than the common phrase “nospam”; authors of spambot software can easily look for these phrases as keywords and use them to target your address. Get creative with your dummy text, just be sure it’s obvious to a human reader that the text needs to be removed.
- If you have multiple email addresses on the page, this method requires that you use the same dummy text in all email addresses.
This email address obfuscation method has been successfully tested in the following browser/OS combinations:
- Firefox 3.0 (Mac OS X, Windows Vista)
- Safari 3.2.1 (Mac OS X, Windows Vista)
- Internet Explorer 6 (Windows XP)
- Internet Explorer 7 (Windows Vista)
- Internet Explorer 8b1 (Windows 7 beta)
- Opera 9.6 (Mac OS X, Windows Vista)
- One issue in Opera: The contextmenu event doesn’t trigger correctly when right-clicking