Email addresses posted on webpages in plain text are quickly collected by spam bots and used to send unsolicited email. To stop bulk emailers from harvesting publicly accessible addresses, or at least make it harder, we can use email obfuscation techniques. Obfuscating publicly displayed email addresses not only cuts down on spam but is also considered a courteous gesture.
Several techniques are available to obfuscate, that is, to hide from spam bots, email addresses posted on publicly accessible webpages.
1. Address Munging
Email address munging is a form of obfuscation in which parts of the address are modified so that it no longer looks like an email address, yet a human reader can still reconstruct it.
The complexity of this approach depends only on human creativity. A common technique is to replace punctuation with words, displaying “.” as “(dot)” and “@” as “(at)”. Other popular options are to separate the parts of the address with spaces, to add a space between every character, or even to reverse the entire address.
Example: nospam@example.com becomes nospam (at) example (dot) com
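The munging rules described above can be sketched in a few lines of JavaScript. This is only an illustration: the function names `mungeAddress` and `reverseAddress` are hypothetical, and the exact replacement words are a matter of taste.

```javascript
// Hypothetical helper: spell out "@" and "." as words.
function mungeAddress(address) {
  return address
    .replace(/@/g, " (at) ")
    .replace(/\./g, " (dot) ");
}

// Hypothetical helper: reverse the address character by character.
function reverseAddress(address) {
  return Array.from(address).reverse().join("");
}

console.log(mungeAddress("nospam@example.com"));   // nospam (at) example (dot) com
console.log(reverseAddress("nospam@example.com")); // moc.elpmaxe@mapson
```

In both cases the reader has to undo the transformation by hand, which is exactly the drawback listed below.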
Good:
- Very easy to implement
- Cross browser compatible.
- No javascript required.
Bad:
- Easy for spam bots to circumvent.
- Does not work with MAILTO link.
- Not transparent – Visible to users.
- Users have to do the “decoding” part.
Sometimes this is the only way to obfuscate email addresses in chat rooms or article comments, but it is not advisable on webpages.
2. Unicode Encoding
In essence, Unicode is a table that assigns a number to every letter or character, regardless of language, platform, or software. The idea is to replace the visible characters of the address with their corresponding numeric codes from the Unicode table, written as HTML character references. Below is a basic table of the most common (ASCII) characters.
Decimal Code | HTML Encoding | Character | Name |
32 | &#32; | (space) | SPACE |
33 | &#33; | ! | EXCLAMATION MARK |
34 | &#34; | " | QUOTATION MARK |
35 | &#35; | # | NUMBER SIGN |
36 | &#36; | $ | DOLLAR SIGN |
37 | &#37; | % | PERCENT SIGN |
38 | &#38; | & | AMPERSAND |
39 | &#39; | ' | APOSTROPHE |
40 | &#40; | ( | LEFT PARENTHESIS |
41 | &#41; | ) | RIGHT PARENTHESIS |
42 | &#42; | * | ASTERISK |
43 | &#43; | + | PLUS SIGN |
44 | &#44; | , | COMMA |
45 | &#45; | - | HYPHEN-MINUS |
46 | &#46; | . | FULL STOP |
47 | &#47; | / | SOLIDUS |
48 | &#48; | 0 | DIGIT ZERO |
49 | &#49; | 1 | DIGIT ONE |
50 | &#50; | 2 | DIGIT TWO |
51 | &#51; | 3 | DIGIT THREE |
52 | &#52; | 4 | DIGIT FOUR |
53 | &#53; | 5 | DIGIT FIVE |
54 | &#54; | 6 | DIGIT SIX |
55 | &#55; | 7 | DIGIT SEVEN |
56 | &#56; | 8 | DIGIT EIGHT |
57 | &#57; | 9 | DIGIT NINE |
58 | &#58; | : | COLON |
59 | &#59; | ; | SEMICOLON |
60 | &#60; | < | LESS-THAN SIGN |
61 | &#61; | = | EQUALS SIGN |
62 | &#62; | > | GREATER-THAN SIGN |
63 | &#63; | ? | QUESTION MARK |
64 | &#64; | @ | COMMERCIAL AT |
65 | &#65; | A | LATIN CAPITAL LETTER A |
66 | &#66; | B | LATIN CAPITAL LETTER B |
67 | &#67; | C | LATIN CAPITAL LETTER C |
68 | &#68; | D | LATIN CAPITAL LETTER D |
69 | &#69; | E | LATIN CAPITAL LETTER E |
70 | &#70; | F | LATIN CAPITAL LETTER F |
71 | &#71; | G | LATIN CAPITAL LETTER G |
72 | &#72; | H | LATIN CAPITAL LETTER H |
73 | &#73; | I | LATIN CAPITAL LETTER I |
74 | &#74; | J | LATIN CAPITAL LETTER J |
75 | &#75; | K | LATIN CAPITAL LETTER K |
76 | &#76; | L | LATIN CAPITAL LETTER L |
77 | &#77; | M | LATIN CAPITAL LETTER M |
78 | &#78; | N | LATIN CAPITAL LETTER N |
79 | &#79; | O | LATIN CAPITAL LETTER O |
80 | &#80; | P | LATIN CAPITAL LETTER P |
81 | &#81; | Q | LATIN CAPITAL LETTER Q |
82 | &#82; | R | LATIN CAPITAL LETTER R |
83 | &#83; | S | LATIN CAPITAL LETTER S |
84 | &#84; | T | LATIN CAPITAL LETTER T |
85 | &#85; | U | LATIN CAPITAL LETTER U |
86 | &#86; | V | LATIN CAPITAL LETTER V |
87 | &#87; | W | LATIN CAPITAL LETTER W |
88 | &#88; | X | LATIN CAPITAL LETTER X |
89 | &#89; | Y | LATIN CAPITAL LETTER Y |
90 | &#90; | Z | LATIN CAPITAL LETTER Z |
91 | &#91; | [ | LEFT SQUARE BRACKET |
92 | &#92; | \ | REVERSE SOLIDUS |
93 | &#93; | ] | RIGHT SQUARE BRACKET |
94 | &#94; | ^ | CIRCUMFLEX ACCENT |
95 | &#95; | _ | LOW LINE |
96 | &#96; | ` | GRAVE ACCENT |
97 | &#97; | a | LATIN SMALL LETTER A |
98 | &#98; | b | LATIN SMALL LETTER B |
99 | &#99; | c | LATIN SMALL LETTER C |
100 | &#100; | d | LATIN SMALL LETTER D |
101 | &#101; | e | LATIN SMALL LETTER E |
102 | &#102; | f | LATIN SMALL LETTER F |
103 | &#103; | g | LATIN SMALL LETTER G |
104 | &#104; | h | LATIN SMALL LETTER H |
105 | &#105; | i | LATIN SMALL LETTER I |
106 | &#106; | j | LATIN SMALL LETTER J |
107 | &#107; | k | LATIN SMALL LETTER K |
108 | &#108; | l | LATIN SMALL LETTER L |
109 | &#109; | m | LATIN SMALL LETTER M |
110 | &#110; | n | LATIN SMALL LETTER N |
111 | &#111; | o | LATIN SMALL LETTER O |
112 | &#112; | p | LATIN SMALL LETTER P |
113 | &#113; | q | LATIN SMALL LETTER Q |
114 | &#114; | r | LATIN SMALL LETTER R |
115 | &#115; | s | LATIN SMALL LETTER S |
116 | &#116; | t | LATIN SMALL LETTER T |
117 | &#117; | u | LATIN SMALL LETTER U |
118 | &#118; | v | LATIN SMALL LETTER V |
119 | &#119; | w | LATIN SMALL LETTER W |
120 | &#120; | x | LATIN SMALL LETTER X |
121 | &#121; | y | LATIN SMALL LETTER Y |
122 | &#122; | z | LATIN SMALL LETTER Z |
123 | &#123; | { | LEFT CURLY BRACKET |
124 | &#124; | | | VERTICAL LINE |
125 | &#125; | } | RIGHT CURLY BRACKET |
126 | &#126; | ~ | TILDE |
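Using this table, we can substitute each character in nospam@example.com with its numeric character reference: &#110;&#111;&#115;&#112;&#97;&#109;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;
A browser converts those references back to the characters they represent and displays the address normally to users. Many Unicode encoders are available, making this a quick and easy solution.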
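The substitution can also be automated. The following is a minimal sketch, assuming decimal numeric character references of the form `&#NN;`; the helper name `encodeAsEntities` is hypothetical.

```javascript
// Hypothetical helper: replace every character with its decimal
// numeric character reference, e.g. "@" -> "&#64;".
function encodeAsEntities(text) {
  return Array.from(text)
    .map((ch) => "&#" + ch.codePointAt(0) + ";")
    .join("");
}

const encoded = encodeAsEntities("nospam@example.com");

// The references are decoded in attribute values as well as in text,
// so the same string can be used inside a MAILTO link.
const link = '<a href="mailto:' + encoded + '">' + encoded + "</a>";
console.log(link);
```

Because any HTML parser performs the same decoding as a browser, a spam bot that parses the page properly sees the plain address, which is why the level of protection is low.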
Good:
- Easy to implement
- Works well with MAILTO link.
- Cross browser compatible.
- No javascript required.
- Transparent to users.
Bad:
- Very easy for spam bots to circumvent.
- Low level of protection.
Useful in environments where the email address must be displayed and JavaScript or CSS is not supported. Better than nothing, but overall too easy for spam bots to circumvent.
3. HTML Comments
In HTML, comments are written between the “<!--” and “-->” delimiters. Since comments are not rendered by the browser, you can insert them into the email address to throw spam bots off.
Example: nospam@example.com
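As a rough sketch, the markup behind such an example could be generated as follows. The helper names `insertComments` and `stripComments` and the comment text are hypothetical; the second function shows how little effort a bot needs to undo the trick.

```javascript
// Hypothetical helper: wrap "@" and "." in HTML comments.
function insertComments(address, comment = "<!-- nospam -->") {
  return address.replace(/[@.]/g, (ch) => comment + ch + comment);
}

// What a browser effectively displays, and what a bot can do as well:
// drop everything between "<!--" and "-->".
function stripComments(html) {
  return html.replace(/<!--[\s\S]*?-->/g, "");
}

const markup = insertComments("nospam@example.com");
console.log(markup);
// nospam<!-- nospam -->@<!-- nospam -->example<!-- nospam -->.<!-- nospam -->com
console.log(stripComments(markup)); // nospam@example.com
```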
Good:
- Very easy to implement
- Cross browser compatible
- No javascript required.
- Transparent to users.
Bad:
- Very easy for spam bots to circumvent
- Does not work with MAILTO link
Useful if neither JavaScript nor CSS is supported. Overall, this option is not recommended because it is too easy to circumvent.
4. CSS – content property
The CSS “content” property can be utilized with either “:before” or “:after” pseudo-elements to insert content into an HTML element.
Example: nospam@example.com
This method can be more effective if the rule is placed in an external CSS file.
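One possible split, shown here as a sketch, keeps the user name in the HTML and moves the rest of the address into a CSS rule. The helper `buildContentObfuscation` and the class name `email` are hypothetical, and the address is assumed to contain no quotes or backslashes.

```javascript
// Hypothetical helper: user name stays in the markup, "@domain" is
// supplied by the CSS "content" property on the :after pseudo-element.
function buildContentObfuscation(address, className = "email") {
  const at = address.indexOf("@");
  const user = address.slice(0, at);
  const rest = address.slice(at); // includes the "@"
  return {
    html: '<span class="' + className + '">' + user + "</span>",
    css: "." + className + ':after { content: "' + rest + '"; }',
  };
}

const parts = buildContentObfuscation("nospam@example.com");
console.log(parts.html); // <span class="email">nospam</span>
console.log(parts.css);  // .email:after { content: "@example.com"; }
```

A bot that reads only the HTML sees "nospam" and nothing that resembles an email address.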
Good:
- Easy to implement
- No javascript required.
- Transparent to users.
Bad:
- Does not work with IE7 or older.
5. CSS – direction property
The direction property in CSS sets the direction of the text. For some languages, such as Hebrew or Arabic, text runs right to left (rtl), whereas for others it runs left to right (ltr). We can write the email address in reverse and, together with the unicode-bidi property, use direction to override the order of the characters so that the address reads correctly when rendered in a browser.
Example: nospam@example.com
As with the CSS content property method, this is more effective when the rule is placed in an external stylesheet.
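A minimal sketch of the markup and rule involved, assuming a hypothetical helper `buildReversedSpan` and class name `reverse`:

```javascript
// Hypothetical helper: store the address reversed in the markup and
// let CSS flip it back when the page is rendered.
function buildReversedSpan(address, className = "reverse") {
  const reversed = Array.from(address).reverse().join("");
  return {
    html: '<span class="' + className + '">' + reversed + "</span>",
    css: "." + className + " { unicode-bidi: bidi-override; direction: rtl; }",
  };
}

const reversedParts = buildReversedSpan("nospam@example.com");
console.log(reversedParts.html); // <span class="reverse">moc.elpmaxe@mapson</span>
console.log(reversedParts.css);  // .reverse { unicode-bidi: bidi-override; direction: rtl; }
```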
Good:
- Easy to implement
- No javascript required.
- Transparent to users.
Bad:
- Does not work with IE8 or older.
6. CSS – display property
The display property accepts the value “none”, which means the element is not rendered by the browser. With this in mind, we can inject hidden HTML elements into the email address, making it more difficult to extract.
Example: nospam@example.com
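The markup for this technique might look like the output of the sketch below. The helper `insertHiddenText`, the decoy text, and the class name `hide` are all hypothetical choices.

```javascript
// Hypothetical helper: place a hidden decoy element right before "@".
function insertHiddenText(address, decoy = "null", className = "hide") {
  const hidden = '<span class="' + className + '">' + decoy + "</span>";
  return {
    // A function replacer avoids special "$" patterns in the replacement.
    html: address.replace("@", () => hidden + "@"),
    css: "." + className + " { display: none; }",
  };
}

const hiddenParts = insertHiddenText("nospam@example.com");
console.log(hiddenParts.html); // nospam<span class="hide">null</span>@example.com
console.log(hiddenParts.css);  // .hide { display: none; }
```

A bot that simply strips tags ends up with "nospamnull@example.com", which is not the real address.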
Good:
- Easy to implement
- No javascript required.
- Transparent to users.
Bad:
- Does not work with MAILTO link
7. JavaScript
There are multiple ways of using JavaScript to output text, but the easiest is the document.write() method, which can output any HTML or text to the browser.
Example: nospam@example.com
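A sketch of how this might look is given below. The address is assembled from separate pieces so that it never appears whole in the page source; the helper name `buildMailtoLink` is hypothetical, and the script is assumed to run inline while the page is still loading.

```javascript
// Hypothetical helper: assemble a MAILTO link from separate parts.
function buildMailtoLink(user, domain) {
  const address = user + "@" + domain;
  return '<a href="mailto:' + address + '">' + address + "</a>";
}

const mailLink = buildMailtoLink("nospam", "example.com");

if (typeof document !== "undefined") {
  // In a browser: write the link at the position of the inline script.
  document.write(mailLink);
} else {
  // Outside a browser (for example in Node.js): just print the markup.
  console.log(mailLink);
}
```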
Good:
- Transparent to users.
- Excellent obfuscation level.
- Can be used with MAILTO link.
Bad:
- Requires javascript enabled in browsers.
- Some WYSIWYG editors can restrict javascript making it more difficult to implement.
8. ROT13 Encryption
ROT13 is a cipher based on rotating alphabetic characters by 13 positions. Numeric and other non-alphabetic characters remain unchanged. Because there are 26 letters in the English alphabet, rotating by 13 both encodes and decodes the string. In our test email, the first letter “n” becomes “a”, since counting 13 letters forward from “n” wraps around to the start of the alphabet.
Example: nospam@example.com becomes abfcnz@rknzcyr.pbz
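The rotation can be written as a short function. This is a sketch with a hypothetical function name, `rot13`; the same function serves for both encoding and decoding.

```javascript
// Hypothetical helper: rotate A-Z and a-z by 13 positions,
// leaving digits and punctuation untouched.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // char code of "A" or "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

const encodedAddress = rot13("nospam@example.com");
console.log(encodedAddress);        // abfcnz@rknzcyr.pbz
console.log(rot13(encodedAddress)); // nospam@example.com
```

On a webpage, the encoded string would be stored in the markup and decoded by a script like this one before being shown or placed in a MAILTO link.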
Good:
- Transparent to users.
- Good obfuscation level.
- Can be used with MAILTO link.
- Can be used in chats and forums
Bad:
- Requires javascript enabled in browsers for automatic decoding.
- Some WYSIWYG editors can restrict javascript making it more difficult to implement.