
Adding Pages to robots.txt Takes Time to Work
Mon, 17 Dec 2007 13:15:04 by Kerry Dye

I wrote a while ago about ways to exclude parts of your site from the search engines. In the section about robots.txt removal I noted that the response was not instant, but I didn't elaborate any more than that, so I thought I would revisit that subject having observed the response to some of my robots.txt changes over the last few months.

In Google, the bottom line on removals is that a page won't get removed until the spider revisits that page. On a small site that is visited often, this happens really quickly. However, on a larger site with many pages, it may be months before the spider revisits the page.

You might have thought that it would work in a different way - that if you added a directory to the robots.txt file, then the first time that robots.txt file was downloaded, Google would go "Aha!" and remove from its index anything that matched that rule. But that isn't what happens; it is done on a page-by-page basis, as the spider finds each page and matches it against the rules in the robots.txt.
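To make the page-by-page behaviour concrete, here is a minimal sketch using Python's standard urllib.robotparser module. It only models the behaviour described above and is not how Google actually implements it; the example.com URLs, the /print/ rule and the blocked_urls helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that has just had a new directory added to it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def blocked_urls(robots_txt, revisited_urls, user_agent="Googlebot"):
    """Return the revisited URLs that the robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in revisited_urls
            if not parser.can_fetch(user_agent, url)]


# Pages currently in the (imaginary) index.
indexed = [
    "http://www.example.com/print/page1.html",
    "http://www.example.com/products/page1.html",
    "http://www.example.com/print/page2.html",
]

# Assumption: the spider has only come back to the first two pages so far.
revisited = indexed[:2]

# Only revisited pages are checked against the rules and dropped.
dropped = blocked_urls(ROBOTS_TXT, revisited)
still_indexed = [url for url in indexed if url not in dropped]
```

In this sketch /print/page2.html matches the new rule but stays in the index, because it has not been revisited yet. That is the delay you see on a large site.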

I have had a lot of SEO success with removing "low value" pages from Google - pages that are almost-duplicates. But because of the way that removal is implemented, the effects can take different amounts of time to show themselves. With a small site, the results are pretty quick - only days may pass before the site races up the rankings with its new, more relevant page selection. In the case of larger sites with tens of thousands of pages, the result is far more gradual, as each page is revisited less often, and the removal process is much slower.

Is there a solution? Well, Google Webmaster Tools allows you to submit removal requests for URLs that you want removed, but each one has to be entered by hand, which is time-consuming for more than a handful of pages (ask my colleague Pete - he removed nearly 300 URLs for a client). However, this is the only quick way to do it (and it still takes a couple of days to be implemented).

If the pages you want removed are deep links, low down the crawling hierarchy of the site, one possibility for speeding things up is to provide a sitemap-like page listing those links, linked temporarily from your home page (and removed once it has done its job).
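A temporary links page like this is easy to generate from a list of URLs. The following is a rough sketch only: the build_links_page function, the page title and the example URLs are all hypothetical, and you would still need to save the output on your site and link to it from your home page yourself.

```python
from html import escape


def build_links_page(urls, title="Temporary page index"):
    """Build a bare-bones HTML page with one link per URL."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical deep pages that are waiting to be revisited.
page = build_links_page([
    "http://www.example.com/print/page1.html",
    "http://www.example.com/old/page.html?id=1&view=print",
])
```

Once the pages have dropped out of the index, take the link off the home page and delete the file.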

The final option is just patience - something that search engine optimisers are quite good at. Eventually the pages will be removed, your site should climb the results, and the ranking of the remaining pages should improve as a result.



Kerry Dye
Campaign Delivery Manager

