Sitemaps are often ignored by webmasters. Their value for both visitor-targeted and spider-targeted optimization is underestimated.
What is a sitemap? In the most general terms, it's a page (or set of pages) that lists and links to all the other documents on your site. Theoretically, it's designed to give your visitors a quick way to find what they are looking for without browsing the entire site. A sitemap also aims at eliminating the need to link to every page of your site from your home page.
In the last few years, sitemaps have gained importance as an SEO factor: they can be used to direct the search engine spiders to all of your content-rich pages. This is especially true for large sites, where a number of clicks are needed to get to specific pages through the numerous sections and subsections. If a site has thousands of pages, its webmaster should really consider dividing it into sections to make navigation easy. Such a deep structure also means that a search engine crawler needs to do a lot of work to find all of the pages. With a sitemap, spiders feel much more "relaxed".
Mainly, a sitemap is important because of the following reasons:
- It helps ensure that all of your content-rich pages are exposed to the search engine spider. With lots of pages and a deep link structure, the crawlers would need to work hard to find all of your pages. When you give them one single page which maps to all the necessary content, you make their job easier and ensure that nothing gets missed.
- It gives you a way to spread Google's PageRank (covered in one of the future lessons) to the pages that need it and distribute it among these pages. PageRank is a very important part of Google's algorithm, but sending it to the right pages may become a headache. So, instead of filling up your home page with internal links, you can use the sitemap to do the job.
- A sitemap can be used for a more advanced PageRank distribution. Say your site has an "About Us" page which is designed solely for your visitors and not targeted at the spiders. If you link to it from every page on your site, that will affect the rankings of your important pages. Instead, you can link to it from your sitemap and only have your important pages linking to each other.
Here we list our collection of tips on how to build an effective sitemap.
A search engine-friendly and visitor-friendly sitemap
Your sitemap must only be linked to from your homepage and no other page, because: a) you want the search engine spiders to find this link directly from your homepage and follow it from there and b) for PageRank distribution purposes, linking to your sitemap from only your home page will spread the PageRank quickly to pages all over your site.
If you have a large website of 50 pages or more, limit the number of pages listed on your sitemap to a maximum of 30, otherwise it can be mistaken for a link farm by the search engines. Limiting the number of entries to 30 also makes a map much easier for real human visitors to read. This step may mean splitting your sitemap over several pages; don't be afraid of that, just make sure each of your sitemap pages links to the next. Otherwise both visitors and search engine spiders will find a broken link, lose interest and go away.
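The splitting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names (sitemap-1.html, sitemap-2.html and so on) and the dictionary layout are hypothetical, and the limit of 30 is simply the figure suggested in the text.

```python
from typing import Dict, List, Optional

MAX_LINKS_PER_PAGE = 30  # the per-page limit suggested in the text


def paginate_sitemap(urls: List[str], per_page: int = MAX_LINKS_PER_PAGE) -> List[Dict]:
    """Split a flat list of URLs into sitemap pages that each link to the next."""
    chunks = [urls[i:i + per_page] for i in range(0, len(urls), per_page)]
    pages = []
    for number, chunk in enumerate(chunks, start=1):
        # Hypothetical naming scheme; the last page has no "next" link.
        next_page: Optional[str] = (
            f"sitemap-{number + 1}.html" if number < len(chunks) else None
        )
        pages.append(
            {"file": f"sitemap-{number}.html", "links": chunk, "next": next_page}
        )
    return pages
```

Every page except the last carries the name of its successor, so neither a visitor nor a spider hits a dead end halfway through the map.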
The title of each sitemap link should be keyword rich and link directly back to the original page. Always link from your sitemap to your pages using the anchor text that will help those pages with their rankings (i.e. use the keywords for link text that the page you're linking to is optimized for). Include around 10 – 20 words of textual content from the original page underneath each sitemap link. This creates more content for search engine spiders and human visitors can see exactly what each page is about before clicking. Besides, descriptions help bring the keyword density of the map down to an acceptable level, should this level be exceeded.
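As a rough sketch of the entry format described above, the following Python function renders one sitemap link with its keyword-rich anchor text and a short excerpt underneath. The function name, the HTML layout and the 20-word cut-off are assumptions made for illustration; adapt the markup to your own site template.

```python
from html import escape


def sitemap_entry(url: str, anchor_text: str, page_text: str, max_words: int = 20) -> str:
    """Render one sitemap link followed by a short excerpt of the target page."""
    # Keep only the first max_words words of the page text as the description.
    snippet = " ".join(page_text.split()[:max_words])
    return (
        f'<p><a href="{escape(url, quote=True)}">{escape(anchor_text)}</a><br>\n'
        f"{escape(snippet)}</p>"
    )
```

The anchor text is passed in separately from the page text so that you can choose the exact keywords the target page is optimized for.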
Ensure that the look and feel of your sitemap page is consistent with the rest of your site. Use the same basic HTML template you used for every other page of your site.
As a solution to the problem of crawling big websites, Google has suggested its Sitemap program. Google claims that its Sitemap technology was created in order to list sites much faster (http://www.google.com/webmasters/sitemaps/docs/en/about.html). The idea is that you inform Google about your site, the number of pages, and how often and how regularly they are updated. It also gives you the ability to see your site from Google's point of view, i.e. learn about errors (https://www.google.com/support/webmasters/bin/answer.py?answer=35120&topic=8474). It is free and gives you an opportunity to index new or changed pages on-the-fly if they conform to Google standards.
And what does this mean in practice? You place the URLs of your pages, together with hints on how Google should index them, into an XML document. The crawler reads this information, and if the pages conform to Google standards they are indexed very quickly.
Here is a simple example of a sitemap file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.yoursite.com/</loc>
    <priority>1.0</priority>
    <lastmod>2005-07-03T16:18:09+00:00</lastmod>
    <changefreq>daily</changefreq>
  </url>
</urlset>
There are several important variables here. <priority> tells the Google spider how important a page is relative to the other pages on your site: 1.0 is the highest priority and 0.0 is the lowest. If you don't want to specify it, the recommended default is 0.5. If some pages are more important than others, give them a higher priority so that the spider favors them over the rest of your site. <lastmod> means "last modified" and helps spiders avoid recrawling pages that have not changed since that date. <changefreq> means "change frequency". This parameter can be very important if you update your pages frequently. The available values are always, hourly, daily, weekly, monthly, yearly and never.
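To make these fields concrete, here is a minimal Python sketch that builds such a file with the standard library. The function name and the dictionary keys are assumptions chosen for this example, not part of any Google tool; the namespace is the one shown in the sample file.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample sitemap file in this lesson.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def build_sitemap(entries):
    """Build sitemap XML from dicts with 'loc' and optional priority/lastmod/changefreq."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]

        # Fall back to the recommended default of 0.5 when no priority is given.
        priority = entry.get("priority", 0.5)
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority must be between 0.0 and 1.0, got {priority}")
        ET.SubElement(url, "priority").text = f"{priority:.1f}"

        if "lastmod" in entry:
            ET.SubElement(url, "lastmod").text = entry["lastmod"]

        if "changefreq" in entry:
            if entry["changefreq"] not in CHANGEFREQ_VALUES:
                raise ValueError(f"unknown changefreq: {entry['changefreq']}")
            ET.SubElement(url, "changefreq").text = entry["changefreq"]

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Rejecting out-of-range priorities and unknown change frequencies up front means a typo surfaces when you generate the file rather than after you have submitted it.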
To keep your site constantly up to date in Google's massive index, you should have a generator that spiders your site, lists the URLs and exports this information to Google. Don't forget that Google also offers the option of submitting a simple text file.
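For the text file option, a sketch can be very short. It assumes the usual plain-text layout of one full URL per line with nothing else in the file; the function name is made up for this example.

```python
def build_text_sitemap(urls):
    """Return a plain-text URL list: one URL per line, blank entries dropped."""
    cleaned = [u.strip() for u in urls if u.strip()]
    return "\n".join(cleaned) + "\n"
```

Unlike the XML format, a text file carries no priority, last-modified date or change frequency.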
There are many generators and different ways to build an XML sitemap file but the most common are as follows:
- Google's Python Generator.
- A PHP Generator.
- Free Online Generator.
Google's Python Generator
How difficult this option is depends on you: if you know your server like the back of your hand and installing scripts doesn't scare the bejesus out of you, you're probably smiling at the word "difficult". Google supplies a link to a generator which you can download and set up on your server. It will cough up your sitemap XML file and automatically feed it to Google.
In order for this Generator to work, Python version 2.2 must be installed on your Web server - many servers don't have this. If you know what you're doing, this will probably be a good choice.
You don't need a Google Account to use Sitemaps, but it's encouraged because you can track your sitemap's progress and view diagnostic information. If you already have a Google Account for another service (Gmail, Google Alerts, etc.), just use that one to sign in and follow the directions from there.
To submit your Sitemap using an HTTP request, issue your request to the following URL:
www.google.com/webmasters/sitemaps/ping?sitemap=sitemap_url
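In the URL above, sitemap_url stands for the full address of your sitemap file, and it should be URL-encoded before it is placed in the query string. The sketch below only builds the request URL and does not send anything. The http:// scheme and the function name are assumptions, since the text gives the address without a scheme.

```python
from urllib.parse import quote

# Endpoint as given in the text; the http:// scheme is an assumption.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url: str) -> str:
    """Return the ping URL with the sitemap address percent-encoded."""
    return f"{PING_ENDPOINT}?sitemap={quote(sitemap_url, safe='')}"
```

For example, build_ping_url("http://www.yoursite.com/sitemap.xml") encodes the colon and slashes of the sitemap address so that it survives as a single query parameter.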
A PHP Code Generator
This is a PHP generator that you can place on your server. It will spider your site and produce your XML sitemap file. Download phpSitemapNG and upload it to your server. Run the generator to get your XML sitemap file and send it to Google.
Again, this is only hard to do if you don't know your way around PHP files or scripts.
Free Online Generator
These Generators are popping up everywhere and Google now keeps a list of these 'third party suppliers' of generators on their site. Find them here: http://code.google.com/sm_thirdparty.html.
One of the easiest to use is http://www.xml-sitemaps.com. This online generator can index up to 500 pages very quickly: it goes into your site, spiders it and indexes all your pages into the XML sitemap file Google needs. You can download this file, compressed or uncompressed, and make minor changes such as setting the priority, changing the frequency, etc.
Then upload this file as sitemap.xml to the root directory of your server, i.e. where you have your homepage. Finally, notify Google Sitemaps of your XML file and you're in business.