Obfuscating email addresses, revisited

A while back, I posted my method for defeating spambots that harvest email addresses. This post is an update to that original method. It explores cleaner, less obtrusive code approaches and more accessible/usable HTML markup.

If you’re impatient and want to jump to some working examples, here you go:

The other “solutions”

So how do you prevent spambots from harvesting your email address? Well, there are a gazillion suggestions out on the interwebs, and unfortunately most of them stink because they require JavaScript, and because they often use illegible or invalid markup. For instance, this example — which was created by an email address obfuscator ranked high in Google searches — uses character entities to render the text completely illegible:

<a href="&#x6d;&#97;&#000105;&#108;&#116;&#000111;&#58;&#000116;&#x68;&#x69;&#000115;&#64;&#x73;&#x74;&#000105;&#x6e;&#x6b;&#x73;&#x2e;&#00099;&#x6f;&#109;"

This method has been popular for a number of years, but has some serious flaws. First of all, how do you know if you even have the right address in there? Secondly, what’s to stop a spambot from reading character entities? I imagine it would be as easy as reading ASCII or UTF. GONG!

Here’s another popular approach, premised on the notion that spambots look for any links using a mailto: protocol:

<script type="text/javascript">
   function emailme(user, domain, suffix){
      var str = 'mai' + 'lto:' + user + '@' + domain + '.' + suffix;
      window.location.replace(str);
   }
</script>;
<a href="javascript:emailme('this','stinks','com')">this@stinks.com</a>

There are multiple problems with this approach. The first problem is that it doesn’t use mailto: in the markup. This means if JavaScript is disabled, the link is completely useless. It also breaks the sematics of the links.

The second problem is that the JavaScript is inline and therefore obtrusive. JavaScript should not be mingling with your markup… it’s bad form! Any link that starts with javascript: is troublesome in my book.

Lastly, the whole address is still contained in the text of the page. If a spambot is sophisticated enough to look for mailto: protocols, it’s probably sophisticated enough to use RegEx to search for text that uses both @ and a period (.) without spaces.

There are other solutions out there, too, but they all require invalid markup, semantically incorrect markup, or flat-out removal of the email hyperlink. I want a solution that remains clickable when JavaScript is disabled, and doesn’t get all screwy with the markup. These don’t fit the bill. There’s another way.

A cleaner solution

My solution is simple: use an invalid email address. No, really! An invalid address with some extra touches and some unobtrusive JavaScript will work wonders. Here’s how to use it, step-by step:

Step one: Create your markup using a slightly altered address

Begin with a real address, then modify it to include some dummy text. For instance, the address sales@visitwaikiki.com would be rewritten salesnotspam@visitwaikiki.com. The spambot will harvest the address salesnotspam@visitwaikiki.com, which won’t work when the spammers try to use it.

The markup should look like this:


<a href="mailto:salesnotspam@visitwaikiki.com">sales@visitwaikiki.com</a>

There’s an obvious flaw here: The email address is still written in plain text between the ‘a’ tags. We’ll need to use alternate text — if you want to avoid spambots, NEVER use the real address as the visible text in an email hyperlink.

Using something such as sales AT visitwaikiki DOT com is also probably a bad idea, simply because zealous spambot authors can look for that very common pattern and manage to parse the email address. You’re best off using a different phrase, such as:

<a href="mailto:janenotspam@visitwaikiki.com">Contact Jane.</a>
<a href="mailto:salesnotspam@visitwaikiki.com">Email our sales department.</a>

We still have another problem to address: The link works, but it’s using the wrong address! The next step will help with that.

Step two: Improve the markup to make the link more usable when JavaScript is disabled

It’s always a good idea to ensure your visitor can use the email hyperlink when JavaScript is disabled. As it stands, when the visitor clicks the link, their operating system will create an email addressed to the invalid address salesnotspam@visitwaikiki.com. Without JavaScript, we can’t correct the address, but we can let the user know that the address needs to be edited.

<a href="mailto:salesnotspam@visitwaikiki.com?subject=EMAIL ADDRESS NEEDS EDITING&body=Please remove the text 'notspam' from the address before sending your email.">
   Email our sales department.
</a>

The mailto: protocol allows users to tack on additional information using the subject and body options. Whatever is listed after subject will appear in the email’s subject line. Whatever is listed after body will appear in the message’s body. By creatively using these options in the email address, we can clearly instruct the visitor to edit the address as-needed. The code above this paragraph produces the following email when clicked:

To: salesnotspam@visitwaikiki.com
Subject: EMAIL ADDRESS NEEDS EDITING
Message: Please remove the text ‘notspam’ from the address before sending your email.

Is it a pain to have to include the subject and/or body options each time you write an address? Yes. But is it more of a pain than the hundreds of spam emails you might get each week? I doubt it.

We now have a fully-functioning standards-friendly markup-only spam-resistant link. (Yes, I love hyphens. Don’t you?) Next, we’ll improve the experience for the 95% or so of your visitors who have JavaScript enabled.

Step three: Use JavaScript to make the link behave normally for most visitors

Most of your visitors will have JavaScript enabled; let’s take advantage of this and improve their experience. Our primary goal with our script will be to correct the invalid address by removing the dummy text “notspam”. However, since we’re removing the dummy text, we’ll also need to remove the instructions contained in the subject and body options so we don’t confuse the visitor.

Here’s a simple function that scans the page for all email links, then removes the dummy text (assuming all links use the same dummy text), the subject option, and the body option:

onload approach

window.onload = function (){
   var links = document.getElementsByTagName("a");
   for (var i=0; i < links.length; i++){
      if(links[i].href.indexOf("mailto:") !== -1){
         this.href = this.href.split("?")[0].replace("notspam", "");
      }
   }
};

Live demo

This teeny bit of JavaScript executes when the page loads and makes all email links behave as expected. Now we have a fully-functioning standards-friendly spam-resistant email link that also degrades nicely for visitors without JavaScript, and looks/feels completely normal to everyone else.

However, if you’re paranoid like me, you’ll wonder: What if the spambot supports JavaScript and looks for email addresses after the page has loaded? Your email address would be just as vulnerable as it was before.

A quick tweak to the script can help: instead of cleaning the addresses when the page loads, we can choose to only clean an address when the link is clicked.

onclick approach

window.onload = function (){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.onclick = function (){};
      this.oncontextmenu = function (){};
   };
   var links = document.getElementsByTagName("a");
   for (var i=0; i < links.length; i++){
      if(links[i].href.indexOf("mailto:") !== -1){
         links[i].onclick = addressCleaner;
         links[i].oncontextmenu = addressCleaner;
      }
   }
};

Live demo

Note: all modern browsers treat a link as ‘clicked’ if you tab to it and hit enter on your keyboard, which means the link remains accessible to those using keyboard navigation and/or screen readers.

Also, notice the oncontextmenu code; when a link is right-clicked, the onclick event isn’t triggered. If a person right-clicks the email address to copy it, they would be copying the invalid version of the address. Using the oncontextmenu event fixes this problem.

You’re done!

You now have a spam-resistant email hyperlink that works whether JavaScript is enabled or not. It adheres to standards (no invalid markup), is semantically correct, and is unobtrusive.

Having said that, you should be aware that this system is not perfect; spammers are very clever, and will always catch up to us. This method is a form of spam resistance, not a foolproof way to defeat all spambots from now until eternity.

While the code you’ve just seen will work fine for most people, there are a few improvements that can be made with the use of a JavaScript framework. If you don’t use a JavaScript framework such as MooTools or jQuery, your journey has ended. If you do use a framework, let’s explore some potential improvements to the system.

Improvements via frameworks

JavaScript frameworks add some impressive tools to our toolbox and provide many conveniences. For this example, I’m going to use MooTools 1.2, but most other frameworks will have similar code that you can adapt for your own needs. Here are some improvements we can make:

  1. Use event handlers instead of direct assignment.
  2. Use a domready event instead of window.onload.
  3. Use CSS selectors and the array:each method

Here’s the improved code, modified to use MooTools 1.2:

window.addEvent("domready", function(){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.removeEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   };
   $$("a[href^=mailto:]").each(function (a){
      a.addEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   });
});

Live demo

Explanation of the MooTools framework version

Since some of you may not be familiar with frameworks, so I’ll try and explain the changes I’ve made.

Event handlers

Most JavaScript gurus will tell you that using event handlers is a much more robust approach than using a direct onclick assignment. For starters, adding an onclick event using direct assignment will overwrite any existing onclick event. Using an event handler will ensure the new event will not destroy any existing events, and will simply add the new event to a queue of events.

//Direct assignment
a.onclick = function (){
  //Do something
};

//MooTools event
a.addEvent("click", function (){
   //Do something
});

As you can imagine, if you don’t use a framework, browser support and cross-browser incompatibility issues make event handlers a bit of a pain. This is one of the primary reasons frameworks have become so popular: they take the pain out of cross-browser compatibility.

Change window.onload to a domready event

The domready event is executed earlier than an onload event. domready basically means that all markup has loaded into the browser DOM, even if images and other media haven’t finished downloading yet. onload, by comparison, only fires after everything has finished loading. A MooTools domready event looks like this:

window.addEvent("domready", function (){
   //do something
});

Use CSS selectors and the array:each method

MooTools allows us to replace document.getElementsByTagName with much more targeted CSS-based selector: $$("a[href^=mailto:]"). This selector finds all links on the page whose href attribute begins with mailto:, then places the results in an new array. This means we can ditch two elements of our original script: the call to

document.getElementsByTagName("a")

and the if syntax inside the loop:

if(links[i].href.indexOf("mailto:") !== -1)

Next, we can replace the for loop with an each method, which performs whatever action is specified to each of the items in the array.

myArray.each(function (arrayitem){
   //do something with arrayitem
});

The each array method is native to browsers not named Internet Explorer. Frameworks like MooTools and jQuery bring support for this function to browsers that don’t natively support it.

Now that we’ve got our CSS-based selector working with the each method, we can greatly simplify our code:

window.addEvent("domready", function(){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.removeEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   };
   $$("a[href^=mailto:]").each(function (a){
      a.addEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   });
});

Tips

  • You can place the dummy text in any part of your email address, not just the username portion. For instance, you could do sales@visitSOMEWHEREOTHERTHANwaikiki.com, sales@visitwaikiki.commie, etc.
  • It’s probably a good idea to use dummy text other than the common phrase “nospam”; authors of spambot software can easily look for these phrases as keywords and use them to target your address. Get creative with your dummy text, just be sure it’s obvious to a human reader that the text needs to be removed.
  • If you have multiple email addresses on the page, this method requires that you use the same dummy text in all email addresses.
  • Be sure you change the dummy text in the JavaScript function to match whatever text you decide to use!

Known Issues

When JavaScript is disabled and someone copies/pastes the email address instead of clicking it, they will be copying the invalid version of the address. To minimize problems, you can write the address in a hard-to-miss way, such as using all caps for the dummy text (salesNOTSPAM@visitwaikiki.com). This will be an extremely small percentage of users, so I wouldn’t worry too much; if they’re savvy enough to disable JavaScript and use copy/paste for email addresses, they’ll probably read the address, too.

This email address obfuscation method has been successfully tested in the following browser/OS combinations:

  • Firefox 3.0 (Mac OS X, Windows Vista)
  • Safari 3.2.1 (Mac OS X, Windows Vista)
  • Internet Explorer 6 (Windows XP)
  • Internet Explorer 7 (Windows Vista)
  • Internet Explorer 8b1 (Windows 7 beta)
  • Opera 9.6 (Mac OS X, Windows Vista)
    • One issue in Opera: The contextmenu menu event doesn’t trigger correctly when right-clicking

New SCORM ebook coming soon!

I'm writing an ebook explaining how to build an HTML-based SCORM course. Subscribe to be notified when it's ready, as well as receive early bird pricing and some free goodies!

No spam, no sharing your email address, unsubscribe at any time. Powered by ConvertKit
Advertisements

3 Replies to “Obfuscating email addresses, revisited”

  1. Well done!
    Nice to see that you follwed the same logical approach I did in year 2004 for email obfuscation.
    We had to organise a big public event with online registration of participants. We couldn’t afford any spam on these email addresses.

    Your concern to “make the link more usable when JavaScript is disabled” was solved in a similar way. The difference is that the user was suggested to activate JS via a fake href=”mailto:Activ@te.JavaScript.in.Browser.then.re-click.link”.
    Strangely enough all the browsers accepted it as a “good email address”, so managing to make the alert.

    I used onmouseover instead of onclick, and added also anonfocus=”this.onmouseover()” for preserving tabbed link navigation thru the page. At that time oncontextmenu was not invented yet.

    About “NEVER use the real address…”: you can see we used a mixed approach. A fairly nice-looking email address, but with nuisances immediately suggesting the user to NOT dumb-copy-paste that text (and why should he do, since it’s much simpler to click or to do a contextmenu copy action?)

    A page with such tricks still seats on:
    http://web.archive.org/web/20041207040024/www.irc-irene.org/ecomondo2004/

  2. Hello, thank you for this very helpful post. I use jQuery rather than MooTools and I ported your script over to jQuery so I thought I’d post it in case it’s of use to anyone else. I also added handlers for the ‘focus’ and ‘mouseover’ events so that the address appears correct in the status bar if users hover the mouse over the link or tab to it.

    $(document).ready(function() {
        // This is based on:
        // http://pipwerks.com/2009/02/01/obfuscating-email-addresses-revisited/
        // by Philip Hutchison
        var events = 'click contextmenu focus mouseover';
        function addressCleaner() {
            this.href = this.href.split('?')[0].replace('notspam', '');
            $(this).unbind(events, addressCleaner);
        }
        $('a[href^="mailto:"]').bind(events, addressCleaner);
    });
    

Comments are closed.