The Differences Between Meta NoIndex, NoFollow and Robots.txt File

April 6th, 2009 | Crawlability

A few weeks ago SEO Zombie wrote an interesting post on the flow of internal link juice. The post itself goes into ways to prevent duplicate content issues on your blog if you’re running WordPress. However, what I found most helpful was the explanation from Matt Cutts of the differences between the NoIndex tag, the NoFollow tag and the Robots.txt file. So if you’re at all confused about what these three do and when you need to use them, this should help clear some things up.

According to SEO Zombie, Matt Cutts explains them as the following:


Listing pages in a Robots.txt file tells the search engines NOT to crawl those pages. However, pages blocked in a Robots.txt file can still accrue PageRank and can still be indexed in search results, says Matt Cutts.

Why you should use a Robots.txt file: I like to exclude images and miscellaneous files on my site from being crawled. I don’t want the search engines to spend time on those files at the expense of more important pages of my site, so I add them to the Robots.txt file. Keep in mind that if you’re trying to stop the flow of PageRank, you’ll need to do more than just add a page to the Robots.txt file.
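To make the crawling rule concrete, here is a minimal sketch using Python’s standard `urllib.robotparser` module. The Robots.txt contents, the directory names and the example.com URLs are all assumptions made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the directory names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /misc/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the parsed rules allow user_agent to crawl url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for url in (
        "http://www.example.com/images/logo.png",
        "http://www.example.com/products/widget.html",
    ):
        verdict = "may be crawled" if is_crawlable(rules, url) else "is blocked from crawling"
        print(url, verdict)
```

Note that this only answers the crawling question. As described above, a blocked URL can still accrue PageRank and appear in search results, so the check says nothing about indexing.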


The NoIndex tag means the search engines can crawl the page and give it PageRank. However, the search engines are not to index the page, and it will not show up in the search results. Again, a page with the NoIndex tag can accumulate PageRank and pass it on, because links are still followed outwards from a NoIndex page.

Why you should use a meta NoIndex tag: I would use this tag only if I definitely did not want my page to be indexed by the search engines, but I did want that page to accrue PageRank and pass it on to other pages of my site. For example, you may not want dynamically driven shopping cart pages or a contact form to be indexed.
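For reference, the NoIndex tag is a robots meta tag placed in the page’s head. The sketch below uses Python’s standard `html.parser` to read the directives out of a page; the cart page markup and the helper names (`RobotsMetaParser`, `robots_directives`) are hypothetical and only meant to show what the tag looks like and how it can be checked.

```python
from html.parser import HTMLParser

# Hypothetical shopping cart page: kept out of the index, but its links are still followed.
CART_PAGE = """
<html>
  <head>
    <title>Your Cart</title>
    <meta name="robots" content="noindex, follow">
  </head>
  <body><a href="/products/">Keep shopping</a></body>
</html>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, follow".
        self.directives.update(d.strip().lower() for d in content.split(",") if d.strip())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives declared on a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    print(sorted(robots_directives(CART_PAGE)))
```

A check like this could be run over a list of your own pages to confirm that the ones you want hidden from search results actually carry the tag.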


Finally, a page with a NoFollow tag tells the search engines that yes, this page can be crawled, but not to follow any of its outgoing links, so no PageRank flows from that page. On its own, the NoFollow tag does not keep the page out of Google’s index; that takes the NoIndex tag.

Why you should use a meta NoFollow tag: Again, if there are more important pages of your site that you would rather the search engines assign PageRank to and index, then you would want to use the NoFollow tag on the less important ones. Most people use the NoFollow and the NoIndex tags together so that a page like a dynamically driven shopping cart page isn’t indexed or assigned any PageRank. The more pages you have on your site, the more your PageRank is distributed between all of them. The more miscellaneous pages you keep from being indexed or passing on PageRank, the more PageRank your important pages get.

It’s also important to know the difference between the NoFollow meta tag and a rel="nofollow" attribute on a link. Using rel="nofollow" on a link only prevents PageRank from flowing through that one link; all other links on the page still pass on PageRank. Of course, if you add the NoFollow meta tag to the whole page, it prevents all links on that page from passing on PageRank.
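The difference between the two can be sketched in a few lines of Python. This is a simplified model of the behaviour described above, not how any search engine is actually implemented, and the page markup and the `LinkAudit` and `followed_links` names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page using link-level nofollow on a single link.
LINK_LEVEL = """
<html><body>
  <a href="/products/">Products</a>
  <a href="/login/" rel="nofollow">Log in</a>
</body></html>
"""

# Hypothetical page using the page-level NoFollow meta tag.
PAGE_LEVEL = """
<html><head><meta name="robots" content="nofollow"></head><body>
  <a href="/products/">Products</a>
  <a href="/login/">Log in</a>
</body></html>
"""


class LinkAudit(HTMLParser):
    """Records every link on a page and whether the page itself is NoFollow."""

    def __init__(self):
        super().__init__()
        self.page_nofollow = False
        self.links = []  # list of (href, link_has_rel_nofollow)

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            directives = [d.strip() for d in (attr.get("content") or "").lower().split(",")]
            if "nofollow" in directives:
                self.page_nofollow = True
        elif tag == "a" and attr.get("href"):
            rel_values = (attr.get("rel") or "").lower().split()
            self.links.append((attr["href"], "nofollow" in rel_values))


def followed_links(html: str) -> list:
    """Return the hrefs that would still be followed under this simplified model."""
    audit = LinkAudit()
    audit.feed(html)
    if audit.page_nofollow:
        return []  # the meta tag applies to every link on the page
    return [href for href, link_nofollow in audit.links if not link_nofollow]


if __name__ == "__main__":
    print(followed_links(LINK_LEVEL))
    print(followed_links(PAGE_LEVEL))
```

In the first page only the log-in link is excluded, while in the second page the meta tag excludes every link at once.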

Using these tags really gives you the opportunity to control what the search engines do with certain pages of your site, and allows you to sculpt where you want your PageRank to flow. Just remember that the pages that benefit your users the most are the ones you want the search engines to spend more time and attention on, not the miscellaneous pages of your site.

  • http://www.greenlemon.in Faiza

    I would like a clarification from the below point you mentioned:

    “Most people use the NoFollow and the NoIndex tags together so that a page like a shopping cart page that is dynamically driven isn’t crawled, indexed or assigned any PageRank.”

    NoFollow and NoIndex doesn’t hinder crawling…right? Then how did you say that both used in conjunction prevents the page from being crawled?

  • http://www.hanapinmarketing.com Amber

    @Faiza, you are correct, if you use a NoFollow and NoIndex either together or separately, the pages can still be crawled – I’ll fix this in my post, thanks for catching that!

  • http://www.greenlemon.in Faiza

    Hi Amber,

    I just realised something while I was tweeting this post. You don’t have social bookmarking buttons except for Sphinn. I bet you are missing a great deal on that!


  • http://articlesfind.com Articles

    So if you add a robots.txt file with certain pages, those pages may still pass page rank? Is there a way to block the flow of page rank using the robots.txt file by itself?

    Thanks for a helpful post!

  • http://www.tajmahaltours-india.com/ shiv

    I want to say something about:

    “Pages that are included in a Robots.txt file tell the search engines NOT to crawl these pages. However pages in a Robots.txt file can still accrue PageRank and can be indexed in search results, says Matt Cutts.”

    How can a page be indexed without being crawled? As per my knowledge, a page can’t be indexed without crawling.

    • http://www.librariaatlas.ro Coman Teodor

      In such cases the page can be indexed if there are links to it on other sites. For example:
      if http://www.xyz.com/abc.html is set in the robots.txt not to be crawled, but on, let’s say, http://www.aaa.com there is a link to http://www.xyz.com/abc.html, then the page will be indexed and ranked according to whatever the crawler can use: the anchor text, the name of the link itself, and the subject of the page on which the link is found.

  • http://twitter.com/joomlads07 William Smith

    Great work Amber! I was not clear about these terms earlier. Thanks!

    William!

    • http://www.hanapinmarketing.com Bethany Bey

      Hi William,

      I’m glad we could help you understand the terms better. Thanks for reading!

  • http://howtonotsnore.blogspot.com Jack L

    Ooh. So that’s how it works. I’ve heard sometimes that nofollow links still pass some link juice, if not as much as dofollow ones. Do you have a definite opinion on that?

  • http://www.facebook.com/madhuamitha Madhaumaitha Mahesh

    Very informative post about how to use the robots.txt file and noindex/nofollow.