Browsers use character sets (charsets) to convert a stream of bytes into readable characters. Each character is represented by a numeric value, and each value is assigned a corresponding character in a table. Hundreds of character encodings are in use. Here are just a few common character encodings used on the web, ordered by popularity:
UTF-8 (Unicode) Covers: Worldwide
ISO-8859-1 (Latin alphabet part 1) Covers: North America, Western Europe, Latin America, the Caribbean, Canada, Africa
WINDOWS-1252 (Latin I)
ISO-8859-15 (Latin alphabet part 9) Covers: Similar to ISO 8859-1 but replaces some less common symbols with the euro sign and some other missing characters
ISO-8859-2 (Latin alphabet part 2) Covers: Eastern Europe
GB2312 (Chinese Simplified)
WINDOWS-1250 (Central Europe)
US-ASCII (basic English)
Note that the popularity of a particular charset depends greatly on the geographical region. You can find all names for character encodings in the IANA registry.
As you can see, there are many possibilities to choose from, so character encoding information should always be specified in the HTTP Content-Type response header sent together with the document. If you do not specify a charset, you risk that the characters in your document will be interpreted and displayed incorrectly.
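To illustrate the risk, here is a minimal Python sketch of what happens when bytes are decoded with the wrong charset. The sample text and the choice of charsets are assumptions made purely for illustration.

```python
# Illustrative sample text (an assumption, not taken from any real page).
text = "café"

# The server stores and sends the document as UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# Decoding with the charset the bytes were written in recovers the text.
decoded_correctly = utf8_bytes.decode("utf-8")

# Decoding the same bytes as WINDOWS-1252 (Python codec name "cp1252")
# turns the two-byte sequence for "é" into two unrelated characters.
decoded_wrongly = utf8_bytes.decode("cp1252")

# The same character can also map to different byte values in similar charsets:
# the euro sign is 0x80 in WINDOWS-1252 but 0xA4 in ISO-8859-15.
euro_cp1252 = "€".encode("cp1252")
euro_latin9 = "€".encode("iso-8859-15")

print(decoded_correctly)
print(decoded_wrongly)
print(euro_cp1252, euro_latin9)
```

A browser that has to guess the charset is in the same position as the second decode call: the bytes are intact, but the table used to read them is the wrong one.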
In the Hypertext Transfer Protocol (HTTP), a header is simply the part of a message that contains additional text fields sent to or from the server. When a browser requests a webpage, the web server sends not only the HTML source code of the page but also fields containing various metadata that describe the settings and operational parameters of the response. In other words, the HTTP header is a set of fields containing supplemental information about the user request or the server response.
From the example above, the “Response Headers” contain several fields with information about the server, content and encoding where the line
Content-Type: text/html; charset=utf-8
informs the browser that the characters in the document are encoded using the UTF-8 charset.
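As a rough sketch of how a program might read that field, the following Python snippet splits a Content-Type value into its media type and charset using the standard library's email header parser. The helper name parse_content_type is hypothetical and exists only for this example.

```python
from email.message import Message


def parse_content_type(header_value):
    """Return (media_type, charset) for a Content-Type header value.

    charset is None when the header does not declare one.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset parameter in lower case.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=utf-8")
print(media_type, charset)

# A header without a charset parameter leaves the browser guessing.
print(parse_content_type("text/html"))
```

When the second element is None, the client has to fall back on other hints or a default, which is where misinterpreted characters tend to come from.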
How to check for missing resources on the webpage?
The easiest way to check for missing resources on a webpage is to use your browser’s developer tools. Most modern browsers come with tool sets that allow you to examine network traffic. The usual way to open the developer tools is to press the “F12” key on your keyboard while viewing the webpage. My preferred way to analyze webpage resources is with Firebug, a developer plugin for the Firefox browser.
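If you would rather script the check than click through the developer tools, something along these lines could work. This is only a sketch under several assumptions: it looks at img, script and link tags only, it uses HEAD requests (which some servers do not support), and all class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of images, scripts and linked files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))


def find_resources(html, base_url):
    """Return the resource URLs referenced by the given HTML source."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.urls


def status_of(url):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as err:  # 4xx and 5xx responses end up here
        return err.code
    except URLError:  # DNS failure, refused connection, and so on
        return None


def missing_resources(html, base_url):
    """Return the referenced URLs that answer with 404 Not Found."""
    return [url for url in find_resources(html, base_url) if status_of(url) == 404]
```

Calling missing_resources(page_source, page_url) with the HTML of a page and its address would return the list of references that the server answers with a 404, which is the same information the network panel of the developer tools shows.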
Why is it important to avoid bad requests?
The first noticeable item in the traffic analysis of the page is the size of the 404 Not Found responses, which are not small compared with our tiny test page of only 277 bytes. The size of the error page varies with the server and website configuration, but it is usually at least several kilobytes, because the response typically consists of headers plus text or HTML code explaining the error. If you have a fancy custom 404 error page that is large in size, the difference will be even more dramatic. Removing references to missing resources will definitely decrease bandwidth usage.
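The arithmetic behind that claim can be sketched in a few lines of Python. Only the 277-byte page size comes from the text above. The two 404 response sizes are made-up placeholders, so substitute the sizes you see in your own traffic analysis.

```python
def wasted_bytes(responses):
    """Sum the sizes of all 404 responses.

    responses is an iterable of (status_code, size_in_bytes) pairs.
    """
    return sum(size for status, size in responses if status == 404)


responses = [
    (200, 277),   # the test page itself (size taken from the text)
    (404, 1500),  # hypothetical size of a missing image response
    (404, 1500),  # hypothetical size of a missing script response
]

wasted = wasted_bytes(responses)
useful = sum(size for status, size in responses if status != 404)

print(f"useful: {useful} bytes, wasted on 404s: {wasted} bytes")
```

With these placeholder numbers, the error responses outweigh the page itself by roughly ten to one, and every one of those bytes disappears once the broken references are removed.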
In April of 2010, Google announced that it includes site speed as one of the signals in its search ranking algorithms. The reasoning behind that decision was rather simple: “Faster sites create happy users”. Google even did some research to back up the obvious point that users prefer faster websites. For a visitor, speed is no doubt an important factor in the overall user experience, but according to the head of Google’s Webspam team, Matt Cutts, site speed plays only a minuscule role in Google’s search ranking algorithms and pales in comparison to relevancy factors:
Google estimates that fewer than 1 in 100 queries are affected by the site speed factor. Nevertheless, a fast website is extremely important in the broader picture. Well-optimized web pages conserve server resources, improve the user experience, and ensure that the website will not be penalized by Google’s page speed check.