8 Email Obfuscation Techniques


Email addresses posted in plain text on public webpages are quickly collected by spam bots and used to send unsolicited email. To stop bulk emailers from harvesting publicly accessible addresses, or at least make it more difficult, we can use email obfuscation techniques. Obfuscating publicly displayed email addresses not only cuts down on spam but is also considered a courteous gesture.

There are several techniques at your disposal to obfuscate, or in other words hide, email addresses posted on publicly accessible webpages from spam bots.

1. Address Munging

Email address munging is a form of obfuscation in which parts of the address are modified so that it no longer looks like an email address, yet a human reader can still reconstruct it.

The complexity of this approach is limited only by human creativity. A common technique is to replace punctuation with words, displaying “.” as “(dot)” and “@” as “at”. Other popular options are to separate the words in the address with spaces, to add a space between every character, or even to reverse the entire address.

Example: nospam@example.com

no spam at example (dot) com
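
The word-substitution style can be sketched in a few lines of JavaScript. The helper names mungeEmail and unmungeEmail are hypothetical, chosen for illustration only; the second function shows how little effort it takes a harvester to reverse this particular pattern.

```javascript
// Hypothetical helper: rewrite an address in the "at / (dot)" style.
function mungeEmail(email) {
  return email
    .replace(/@/g, ' at ')
    .replace(/\./g, ' (dot) ');
}

// Hypothetical helper: what a simple harvester could do to undo the munging.
function unmungeEmail(text) {
  return text
    .replace(/\s*\(dot\)\s*/g, '.')
    .replace(/\s+at\s+/g, '@');
}

const munged = mungeEmail('nospam@example.com');
console.log(munged);               // prints "nospam at example (dot) com"
console.log(unmungeEmail(munged)); // prints "nospam@example.com"
```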

Good:

  • Very easy to implement
  • Cross browser compatible.
  • No javascript required.

Bad:

  • Easy for spam bots to circumvent.
  • Does not work with MAILTO link.
  • Not transparent – Visible to users.
  • Users have to do the “decoding” part.

Sometimes this is the only way to obfuscate email addresses, for example in chat rooms or article comments, but it is not advisable on webpages.

 

2. Unicode Encoding

In essence, Unicode is a table that assigns a number to every letter or character, regardless of language, platform, or software. The idea is to replace each visible character with an HTML numeric character reference built from its Unicode number: “&#”, then the decimal code, then a semicolon. Below is a basic table of the most common (ASCII) characters.

Decimal Code HTML encoding Character Name
32 &#32; (space) SPACE
33 &#33; ! EXCLAMATION MARK
34 &#34; " QUOTATION MARK
35 &#35; # NUMBER SIGN
36 &#36; $ DOLLAR SIGN
37 &#37; % PERCENT SIGN
38 &#38; & AMPERSAND
39 &#39; ' APOSTROPHE
40 &#40; ( LEFT PARENTHESIS
41 &#41; ) RIGHT PARENTHESIS
42 &#42; * ASTERISK
43 &#43; + PLUS SIGN
44 &#44; , COMMA
45 &#45; - HYPHEN-MINUS
46 &#46; . FULL STOP
47 &#47; / SOLIDUS
48 &#48; 0 DIGIT ZERO
49 &#49; 1 DIGIT ONE
50 &#50; 2 DIGIT TWO
51 &#51; 3 DIGIT THREE
52 &#52; 4 DIGIT FOUR
53 &#53; 5 DIGIT FIVE
54 &#54; 6 DIGIT SIX
55 &#55; 7 DIGIT SEVEN
56 &#56; 8 DIGIT EIGHT
57 &#57; 9 DIGIT NINE
58 &#58; : COLON
59 &#59; ; SEMICOLON
60 &#60; < LESS-THAN SIGN
61 &#61; = EQUALS SIGN
62 &#62; > GREATER-THAN SIGN
63 &#63; ? QUESTION MARK
64 &#64; @ COMMERCIAL AT
65 &#65; A LATIN CAPITAL LETTER A
66 &#66; B LATIN CAPITAL LETTER B
67 &#67; C LATIN CAPITAL LETTER C
68 &#68; D LATIN CAPITAL LETTER D
69 &#69; E LATIN CAPITAL LETTER E
70 &#70; F LATIN CAPITAL LETTER F
71 &#71; G LATIN CAPITAL LETTER G
72 &#72; H LATIN CAPITAL LETTER H
73 &#73; I LATIN CAPITAL LETTER I
74 &#74; J LATIN CAPITAL LETTER J
75 &#75; K LATIN CAPITAL LETTER K
76 &#76; L LATIN CAPITAL LETTER L
77 &#77; M LATIN CAPITAL LETTER M
78 &#78; N LATIN CAPITAL LETTER N
79 &#79; O LATIN CAPITAL LETTER O
80 &#80; P LATIN CAPITAL LETTER P
81 &#81; Q LATIN CAPITAL LETTER Q
82 &#82; R LATIN CAPITAL LETTER R
83 &#83; S LATIN CAPITAL LETTER S
84 &#84; T LATIN CAPITAL LETTER T
85 &#85; U LATIN CAPITAL LETTER U
86 &#86; V LATIN CAPITAL LETTER V
87 &#87; W LATIN CAPITAL LETTER W
88 &#88; X LATIN CAPITAL LETTER X
89 &#89; Y LATIN CAPITAL LETTER Y
90 &#90; Z LATIN CAPITAL LETTER Z
91 &#91; [ LEFT SQUARE BRACKET
92 &#92; \ REVERSE SOLIDUS
93 &#93; ] RIGHT SQUARE BRACKET
94 &#94; ^ CIRCUMFLEX ACCENT
95 &#95; _ LOW LINE
96 &#96; ` GRAVE ACCENT
97 &#97; a LATIN SMALL LETTER A
98 &#98; b LATIN SMALL LETTER B
99 &#99; c LATIN SMALL LETTER C
100 &#100; d LATIN SMALL LETTER D
101 &#101; e LATIN SMALL LETTER E
102 &#102; f LATIN SMALL LETTER F
103 &#103; g LATIN SMALL LETTER G
104 &#104; h LATIN SMALL LETTER H
105 &#105; i LATIN SMALL LETTER I
106 &#106; j LATIN SMALL LETTER J
107 &#107; k LATIN SMALL LETTER K
108 &#108; l LATIN SMALL LETTER L
109 &#109; m LATIN SMALL LETTER M
110 &#110; n LATIN SMALL LETTER N
111 &#111; o LATIN SMALL LETTER O
112 &#112; p LATIN SMALL LETTER P
113 &#113; q LATIN SMALL LETTER Q
114 &#114; r LATIN SMALL LETTER R
115 &#115; s LATIN SMALL LETTER S
116 &#116; t LATIN SMALL LETTER T
117 &#117; u LATIN SMALL LETTER U
118 &#118; v LATIN SMALL LETTER V
119 &#119; w LATIN SMALL LETTER W
120 &#120; x LATIN SMALL LETTER X
121 &#121; y LATIN SMALL LETTER Y
122 &#122; z LATIN SMALL LETTER Z
123 &#123; { LEFT CURLY BRACKET
124 &#124; | VERTICAL LINE
125 &#125; } RIGHT CURLY BRACKET
126 &#126; ~ TILDE

 

Using this table, we can replace each character of nospam@example.com with its code:

&#110;&#111;&#115;&#112;&#97;&#109;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;

A browser converts those codes back into the characters they represent and displays the address to users as normal. There are many Unicode encoders out there, which makes this a quick and easy solution.
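
The encoding step can be sketched as follows. The function name toNumericEntities is an assumption made for this example; it simply turns every character into “&#”, its decimal code, and a semicolon.

```javascript
// Hypothetical helper: encode every character as an HTML numeric
// character reference (&#NNN;).
function toNumericEntities(text) {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join('');
}

const encoded = toNumericEntities('nospam@example.com');
console.log(encoded);

// The encoded string can be used both as link text and inside a MAILTO link.
console.log(`<a href="mailto:${encoded}">${encoded}</a>`);
```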

Good:

  • Easy to implement
  • Works well with MAILTO link.
  • Cross browser compatible.
  • No javascript required.
  • Transparent to users.

Bad:

  • Very easy for spam bots to circumvent.
  • Low level of protection.

Useful in environments where the email address must be displayed and JavaScript or CSS is not supported. It is better than nothing, but overall too easy for spam bots to circumvent.

 

3. HTML Comments

In HTML, comments are written between “<!--” and “-->”. Since comments are not rendered by the browser, you can insert them into the email address to throw spam bots off.

Example: nospam@example.com

no<!-- spam -->spam<!-- @ -->@<!-- -->example<!-- . -->.<!-- spam. -->com

Good:

  • Very easy to implement
  • Cross browser compatible
  • No javascript required.
  • Transparent to users.

Bad:

  • Very easy for spam bots to circumvent
  • Does not work with MAILTO link

Useful if neither JavaScript nor CSS is supported. Overall, this option is not recommended because it is too easy to circumvent.
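
To illustrate why this is easy to circumvent, here is a minimal sketch of what a harvester might do before searching a page for addresses. The function name stripHtmlComments is hypothetical.

```javascript
// The obfuscated markup, with comments interleaved into the address.
const obfuscated =
  'no<!-- spam -->spam<!-- @ -->@<!-- -->example<!-- . -->.<!-- spam. -->com';

// Hypothetical helper: remove every HTML comment with one regular expression.
function stripHtmlComments(html) {
  return html.replace(/<!--[\s\S]*?-->/g, '');
}

console.log(stripHtmlComments(obfuscated)); // prints "nospam@example.com"
```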

 

4. CSS – content property

The CSS “content” property can be used with either the “:before” or the “:after” pseudo-element to insert content into an HTML element.

Example: nospam@example.com

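<style type="text/css">
/* "\40 " is the CSS escape for "@"; the trailing space ends the escape so the "e" of "example" is not read as a hex digit */
span.email:after { content: "nospam\40 example.com"; }
</style>
<span class="email"></span>

This method can be more effective if the rule is placed in an external CSS file.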
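
A CSS hex escape consumes up to six hexadecimal digits, so an escape followed directly by a letter such as “e” can be misread. One way to sidestep this is to generate the rule and write every character as a full six-digit escape. The sketch below assumes that approach; toCssEscapes and emailRule are hypothetical names.

```javascript
// Hypothetical helper: write each character as a six-digit CSS hex escape,
// for example "@" becomes "\000040". Six digits is the maximum length, so
// the following character can never be absorbed into the escape.
function toCssEscapes(text) {
  return Array.from(text, (ch) =>
    '\\' + ch.codePointAt(0).toString(16).padStart(6, '0')
  ).join('');
}

// Hypothetical helper: build a complete rule for the given selector.
function emailRule(selector, email) {
  return `${selector}:after { content: "${toCssEscapes(email)}"; }`;
}

console.log(emailRule('span.email', 'nospam@example.com'));
```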

Good:

  • Easy to implement
  • No javascript required.
  • Transparent to users.

Bad:

  • Does not work with IE7 or older.

 

5. CSS – direction property

The CSS direction property sets the direction of text. In some languages, such as Hebrew or Arabic, text runs right to left (rtl), whereas in others it runs left to right (ltr). We can write the email address in reverse in the page source and then use the direction property together with the unicode-bidi property to override the text direction, so that the address reads correctly when rendered in a browser.

Example: nospam@example.com

<style type="text/css">
span.reverse { unicode-bidi: bidi-override; direction: rtl; }
</style>
<span class="reverse">moc.elpmaxe@mapson</span>

As with the CSS content property method, this is more effective when the rule is placed in an external stylesheet.
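
The reversed string does not have to be typed by hand. A minimal sketch, using a hypothetical helper named reverseText:

```javascript
// Hypothetical helper: reverse a string character by character.
function reverseText(text) {
  return Array.from(text).reverse().join('');
}

console.log(reverseText('nospam@example.com')); // prints "moc.elpmaxe@mapson"
```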

Good:

  • Easy to implement
  • No javascript required.
  • Transparent to users.

Bad:

  • Does not work with IE8 or older.

 

6. CSS – display property

The display property accepts the value “none”, which means the element is not displayed when the page is rendered by the browser. With this in mind, we can inject hidden HTML elements into the email address, making it more difficult to extract.

Example: nospam@example.com

<style type="text/css">
.hide { display: none; }
</style>
nospam<span class="hide">null</span>@<span class="hide">null</span>example.com
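
The markup can also be generated. The sketch below assumes a decoy word of “null” and a class name of “hide”; both function names are hypothetical. The second function shows what a harvester that simply strips tags would end up with.

```javascript
// Hypothetical helper: wrap a decoy word in hidden spans on both sides of "@".
function withHiddenDecoys(email, decoy = 'null', className = 'hide') {
  const [local, domain] = email.split('@');
  const span = `<span class="${className}">${decoy}</span>`;
  return `${local}${span}@${span}${domain}`;
}

// Hypothetical helper: a naive tag stripper, as a simple harvester might use.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

const markup = withHiddenDecoys('nospam@example.com');
console.log(markup);
console.log(stripTags(markup)); // prints "nospamnull@nullexample.com"
```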

Good:

  • Easy to implement
  • No javascript required.
  • Transparent to users.

Bad:

  • Does not work with MAILTO link

 

7. JavaScript

There are multiple ways to output text with JavaScript, but the easiest is the document.write() method, which can output any HTML or text to the page.

Example: nospam@example.com

<script type="text/javascript">
<!--
// Build the address from parts so that it never appears whole in the page source
var s1 = "nospam";
var s2 = "@";
var s3 = "example.com";
document.write(s1 + s2 + s3);
//-->
</script>
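
The same idea extends to a MAILTO link. The following is a sketch only: it uses DOM methods instead of document.write(), and it assumes the page contains a placeholder element with id="contact". The helper names are hypothetical.

```javascript
// Hypothetical helper: assemble the address from parts so that it never
// appears as one literal string in the page source.
function buildAddress(user, domain) {
  return user + '@' + domain;
}

// Hypothetical helper: create an <a href="mailto:..."> element for the address.
function createMailtoLink(user, domain) {
  const address = buildAddress(user, domain);
  const link = document.createElement('a');
  link.href = 'mailto:' + address;
  link.textContent = address;
  return link;
}

// Only touch the DOM when running in a browser. The "contact" id is an
// assumption made for this example.
if (typeof document !== 'undefined') {
  const target = document.getElementById('contact');
  if (target) {
    target.appendChild(createMailtoLink('nospam', 'example.com'));
  }
}
```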

Good:

  • Transparent to users.
  • Excellent obfuscation level.
  • Can be used with MAILTO link.

Bad:

  • Requires javascript enabled in browsers.
  • Some WYSIWYG editors can restrict javascript making it more difficult to implement.

 

8. ROT13 Encryption

ROT13 is a cipher based on rotating alphabetic characters by 13 positions. Digits and other non-alphabetic characters remain unchanged. Because there are 26 letters in the English alphabet, rotating by 13 both encodes and decodes a string. In our test email, the first letter “n” becomes “a”, because counting 13 letters on from “n” wraps around the end of the alphabet to “a”.

Example: nospam@example.com

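<script type="text/javascript">
<!--
// Rotate each letter by 13 places; other characters are left unchanged
function str_rot13(str) {
  return (str + '').replace(/[a-z]/gi, function (s) {
    // Letters a-m move forward 13 places, letters n-z move back 13
    return String.fromCharCode(s.charCodeAt(0) + (s.toLowerCase() < 'n' ? 13 : -13));
  });
}
// "abfcnz@rknzcyr.pbz" is the ROT13 form of nospam@example.com
document.write(str_rot13('abfcnz@rknzcyr.pbz'));
//-->
</script>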
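
Because ROT13 is its own inverse, the same function can produce the encoded string in the first place. A minimal sketch, written with modular arithmetic rather than the comparison used above; the name rot13 is chosen for this example only.

```javascript
// Rotate each ASCII letter 13 places within its own case; leave digits,
// "@", "." and every other character unchanged.
function rot13(text) {
  return text.replace(/[a-z]/gi, (ch) => {
    const base = ch <= 'Z' ? 65 : 97; // 65 = "A", 97 = "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

const encoded = rot13('nospam@example.com');
console.log(encoded);        // prints "abfcnz@rknzcyr.pbz"
console.log(rot13(encoded)); // prints "nospam@example.com"
```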

Good:

  • Transparent to users.
  • Good obfuscation level.
  • Can be used with MAILTO link.
  • Can be used in chats and forums

Bad:

  • Requires javascript enabled in browsers for automatic decoding.
  • Some WYSIWYG editors can restrict javascript making it more difficult to implement.