Novice webmasters are often unaware of canonical URLs and how they can affect SEO efforts. Google calls the process of picking the best URL when there are several alternatives canonicalization. Canonical URLs, as the term is used in this post, are URLs that look nearly the same and will probably produce the same content when typed into a browser.
After I posted a comment on another blog, taking exception to a few statements the blog owner had made about not needing SEO to rank in Google, one of the blog's followers commented that several of the websites I own were suffering from canonicalization. I knew what canonicalization was with respect to domain name servers (DNS), as it was a technique I had used for years when setting up internet servers, and I had assumed that it was similar in the SEO world.
For example, the following URLs would be considered canonical URLs:
www.barrywheeler.ca
barrywheeler.ca
www.barrywheeler.ca/index.php
These different URLs would all produce the same content in a web browser, but technically they are different URLs. It is possible that these URLs could all return different content. Google attempts to pick the URL that it feels best represents the content. As a result, when multiple URLs represent the same content, it is possible that Google will assess a duplicate content penalty on the canonical URLs.
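To make the idea concrete, here is a minimal Python sketch that maps the three variants above onto one preferred form. The helper name `canonical_form`, the choice of the WWW host as the preferred one, and the treatment of `index.php` as the directory index are all assumptions made for illustration. This is not how Google itself picks a URL.

```python
from urllib.parse import urlsplit

def canonical_form(url, preferred_host="www.barrywheeler.ca", index_names=("index.php",)):
    """Map a URL variant to one preferred form (hypothetical helper)."""
    # urlsplit only recognises the host part when a scheme is present
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Work out the non-WWW spelling of the preferred host
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host == bare:
        host = preferred_host
    path = parts.path or "/"
    # Assumption: a trailing index file serves the same page as its directory
    for name in index_names:
        if path.endswith("/" + name):
            path = path[: -len(name)]
    return "http://" + host + path

variants = ["www.barrywheeler.ca", "barrywheeler.ca", "www.barrywheeler.ca/index.php"]
for v in variants:
    print(v, "->", canonical_form(v))
```

All three variants come out as `http://www.barrywheeler.ca/`, which is the single URL that the rest of this post tries to enforce on the server side.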
If a website has canonical URLs, links from external websites may be split across the different variants, so no single URL receives the full benefit for link popularity. This can significantly affect the SEO and search results for a website.
Matt Cutts, head of Google’s webspam team, suggests that webmasters deal with canonical URLs in a specific way. His first recommendation is to ensure that all links are consistent throughout a website. As well, Google Webmaster Tools allows webmasters to specify the preferred URL to use when displaying search results, for example having all non-WWW URLs shown with the WWW.
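One rough way to check that internal links are consistent is to scan a page for links that use the non-WWW host. The sketch below uses only the Python standard library. The function name `inconsistent_links`, the sample markup, and the domain example.com are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def inconsistent_links(html, preferred_host="www.example.com"):
    """Return absolute links that use the non-WWW variant of preferred_host."""
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    parser = LinkCollector()
    parser.feed(html)
    # Relative links have an empty host and are never flagged
    return [h for h in parser.hrefs if urlsplit(h).netloc.lower() == bare]

page = (
    '<a href="http://example.com/about">About</a> '
    '<a href="http://www.example.com/contact">Contact</a> '
    '<a href="/blog">Blog</a>'
)
print(inconsistent_links(page))
```

Only the first link is reported, since it is the one that points at the bare domain.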
It is possible to address canonical URLs using a 301 redirect. A 301 redirect is a response from the server that means “moved permanently”, and it sends browsers and search engines to a single consolidated URL. This effectively consolidates all potential link popularity into single URLs, which increases the chances of pages obtaining and maintaining rankings for the keywords selected.
Given that my sites apparently suffered from canonical URLs, I was concerned about duplicate content penalties. A 301 redirect to correct the canonical URLs that existed across my sites appeared to be the quickest and easiest fix to implement.
Most web hosting accounts are configured to allow each hosted domain, or each directory within it, to have its own set of options. 301 redirects to address canonical URLs can be added to the configuration file that controls these options. On Apache servers this file is called the .htaccess file, and it is generally located in the same directory that contains the website files.
For the domains that I run, I wanted all non-WWW URLs to be shown with the WWW. I had enabled this option in Google Webmaster Tools but had never considered the issue of canonicalization until it was pointed out to me.
To correct the canonical URLs within the domain “example.com” so that all URLs show as “www.example.com”, the following lines would be added to the .htaccess file for example.com:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
This would force all non-WWW URLs to include the WWW and would correct the issue of canonical URLs for this domain. It is also important to make a backup of the original .htaccess file before making any changes.
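For readers who find rewrite rules hard to read, the logic can be sketched in a few lines of Python. This is only an illustration of what the rules are meant to do, not how the server implements them. The function name `redirect_for` is made up, and the host pattern is written here in a strict form (escaped dot, anchored at both ends, case-insensitive).

```python
import re

def redirect_for(host, path):
    """Return (status, location) for a non-WWW request, or None if no redirect applies."""
    # Counterpart of the RewriteCond line: only the bare host should match
    if re.match(r"^example\.com$", host, re.IGNORECASE):
        # Counterpart of the RewriteRule line: keep the requested path, change the host
        return 301, "http://www.example.com/" + path.lstrip("/")
    return None

print(redirect_for("example.com", "/about.html"))
print(redirect_for("www.example.com", "/about.html"))
```

The first call yields a 301 pointing at the WWW address with the same path, while the second returns `None` because the request already uses the preferred host.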
How did I know that this worked for my domains?
I found a great tool for checking server headers. It displays the HTTP headers that a URL returns, which shows whether the page responded with HTTP 200 OK or 301 Moved Permanently for the canonical URLs.
For my domain www.barrywheeler.ca the following showed:

Thus, the page was found successfully. Next, testing barrywheeler.ca, I should see a server header indicating that the page was permanently moved (a 301 redirect for the canonical URLs). In testing the non-WWW version of barrywheeler.ca, the following was returned by the tool:

The modifications to the .htaccess worked, and the issue of canonical URLs was corrected.
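The same check can be scripted instead of using an online tool. The sketch below is a minimal example built on Python's standard `http.client`, which does not follow redirects and so reports a 301 as it is. So that it runs offline, a throwaway local server (`DemoHandler`) stands in for a site with the WWW redirect in place. That handler, the helper name `fetch_status`, and the domain example.com are assumptions for illustration, not output from my real server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def fetch_status(address, port, host_header, path="/"):
    """Send a HEAD request and return (status, Location header or None)."""
    conn = http.client.HTTPConnection(address, port, timeout=5)
    try:
        # Redirects are not followed, so a 301 is returned to the caller as-is
        conn.request("HEAD", path, headers={"Host": host_header})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in for a server with the WWW redirect in place (assumption)."""
    def do_HEAD(self):
        host = self.headers.get("Host", "")
        if host.lower() == "example.com":
            self.send_response(301)
            self.send_header("Location", "http://www.example.com" + self.path)
        else:
            self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet

# Start the stand-in server on a free local port
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

www_result = fetch_status("127.0.0.1", port, "www.example.com")
bare_result = fetch_status("127.0.0.1", port, "example.com")

server.shutdown()
server.server_close()

print("www.example.com ->", www_result)
print("example.com     ->", bare_result)
```

To test a live site, `fetch_status` would be called with the real host name and port 80 in place of the local address, assuming the site is reachable over plain HTTP.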
Remember, it is important to ensure that you avoid duplicate content penalties that may be assessed because of canonical URLs. It is also important to ensure that links pass their full value to the website.
