
Need Another Reason to Use XML Sitemaps? Compare Index Stats in Google Webmaster Tools

January 29th, 2009 | Crawlability

As we’ve discussed before, submitting an XML Sitemap to Google (and the other search engines) is an openly debated practice. Proponents of sitemaps tout benefits to indexation and visibility with the search engines. Those against hold a more principled stance – that your site should be optimized in such a way that submitting an XML Sitemap isn’t needed. I for one fall under the proponent category, and I’ve found another reason to back up my case: Google Webmaster Tools’ Sitemap Details page.

In addition to showing when you first submitted your sitemap, when it was last downloaded and its current status (“OK”), this page offers an interesting look at how well your content is being indexed compared to what you submitted in the sitemap, through the “Indexed URLs in Sitemap” statistic.

Everyone knows (or should know) how to use the site: command to review pages indexed in Google. This is the primary method for gauging how well, or how deeply, your site is being crawled and indexed. But the Sitemap Details page adds a new stat to the mix, one that I’m exploring with equal parts satisfaction and curiosity. What exactly does this stat mean? I’m not 100% certain, but I’ve got a few ideas of what you can do with the information.

  1. This statistic can give you a quick appraisal of your overall performance in terms of indexation. Check the Sitemap Details page regularly to keep tabs on how well you’re doing.
  2. Use this as a diagnostic tool. If you find that your website traffic has dropped, and you’re unsure what’s to blame – check to see if your sitemap-to-index ratio has dropped. This could mean that despite having your URLs in the sitemap, there’s a crawlability issue on your site that needs to be fixed.
  3. Regardless of performance, this statistic could be the impetus to sniff out crawlability issues on your website. If you find that your sitemap-to-index ratio is consistently low, you’ve got work to do.
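
If you jot down the two numbers from the Sitemap Details page each time you check, the diagnostic idea in the list above can be sketched in a few lines of Python. This is only a rough illustration: the `SitemapSnapshot` class, the `flag_drops` helper, the 10-point drop threshold, and the sample figures are all made up for the example, and the numbers are entered by hand rather than pulled from any Google API.

```python
from dataclasses import dataclass


@dataclass
class SitemapSnapshot:
    """One manual reading of the Sitemap Details page (hypothetical structure)."""
    label: str       # when the reading was taken, e.g. "week 1"
    submitted: int   # URLs listed in the sitemap
    indexed: int     # "Indexed URLs in Sitemap" as reported

    @property
    def ratio(self) -> float:
        # Sitemap-to-index ratio; 0.0 if nothing was submitted.
        return self.indexed / self.submitted if self.submitted else 0.0


def flag_drops(snapshots, max_drop=0.10):
    """Return labels of readings whose ratio fell by more than max_drop
    compared with the previous reading. The threshold is an assumption."""
    flagged = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        if prev.ratio - curr.ratio > max_drop:
            flagged.append(curr.label)
    return flagged


# Made-up readings for illustration only.
history = [
    SitemapSnapshot("week 1", submitted=120, indexed=114),
    SitemapSnapshot("week 2", submitted=120, indexed=115),
    SitemapSnapshot("week 3", submitted=120, indexed=90),
]

print(flag_drops(history))  # ['week 3']
```

A flagged reading doesn't tell you what went wrong, only that it's time to go looking for a crawlability problem.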

I should point out that Google specifically states that this data is “a close approximation of the status of your URLs” and that “this figure might not be 100%.” However, when I checked the stats for SEO Boy today, the number of URLs listed in GWT for my XML Sitemap was 120 after having been downloaded an hour prior – and 120 was right on. My “Indexed URLs in Sitemap” was 115 – which means I’ve got some work to do!
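
To double-check the submitted count yourself and turn the two figures into a percentage, something like the following Python sketch would do. It assumes a standard sitemaps.org `urlset` file; the helper names `count_sitemap_urls` and `index_ratio` and the two-URL sample sitemap are hypothetical, and the indexed count still has to be read off the Sitemap Details page by hand.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML Sitemaps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def count_sitemap_urls(xml_text: str) -> int:
    """Count the <loc> entries inside <url> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return len(root.findall("sm:url/sm:loc", SITEMAP_NS))


def index_ratio(indexed: int, submitted: int) -> float:
    """Share of submitted URLs that are reported as indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return indexed / submitted


# Tiny stand-in sitemap for illustration; a real one would be read from disk.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
</urlset>"""

print(count_sitemap_urls(sample))        # 2
print(f"{index_ratio(115, 120):.1%}")    # 95.8%
```

With the figures from this post, 115 indexed out of 120 submitted works out to roughly 95.8%, which leaves five URLs to track down.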

  • http://www.bestwebimage.com Rob

    Love your logo, and header, not sure about the post though. I have a few sites that would disagree. I have one that has over 30,000 pages in the sitemap, and only 8,000 are indexed. My slower sites (slower in growth) tend to agree with you though.

  • http://www.seoboy.com John

    @ Rob,

    Not 100% sure what you’re agreeing/disagreeing with. I was merely pointing out that by submitting an XML Sitemap to Google, you could view the ratio of pages listed in your sitemap to those actually indexed (as reported by Google).

    It may very well be possible that your 30K page site only has 8000 indexed. While I’m a proponent of using and submitting sitemaps, I’ve never been one to naively believe that it’s a ticket to 100% indexation! You still have to work on the fundamentals of internal linking and site structure to ensure that Google, and the rest of the search engines, can find your content and index/rank it appropriately!

    Thanks for commenting!

  • http://pcbix.dk Peter

    @both =)

    I have a site with 40k pages and 40 sitemap files with 1k URLs each (1 line, 1 URL), and I’m getting a 50% index ratio. I was basically in the same situation as Rob when I had 1 sitemap file. I’m still wondering how I could get closer to 100% – the customer should have the possibility of finding the right page…
    The site is very much an ecosystem in its own right and thus ever changing.
    Using more (and ever more) sitemaps solved – rather helped – the 8000/30000 index ratio situation… buuuut, it’s a little lame… even if it works, it’s a lot of extra work.
    I’m pretty sure internal linking could not really be the “baddy” here – ’cause all pages are linked to from the top and all pages link to all pages in the same group through “list views” of those groups – as I’m sure Rob’s dynamic 30k pages site is too in some way or another… Rob? What site are you referring to?
    I would probably never use the XML solution, as I too am in great need of another reason.. =)