SEARCH MARKETING BLOG

Checking your Robots.txt file

A robots.txt file is recommended for all websites to help direct search engines around your site. A robots file should always contain a link to your sitemap.xml file, as below:

Sitemap: http://www.yoursite.co.uk/sitemap.xml
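If you would like to confirm the Sitemap line is present before uploading, a quick local check is possible. The following is only a sketch, assuming Python 3.8 or later (for the standard library's site_maps() method). The sample file contents, including the User-agent line, and the sitemap_urls helper name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: http://www.yoursite.co.uk/sitemap.xml
"""

def sitemap_urls(robots_text):
    """Return the Sitemap URLs declared in a robots.txt string (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []

print(sitemap_urls(ROBOTS_TXT))
```

An empty list from sitemap_urls would suggest the Sitemap line is missing or mistyped.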

Once you have created your robots.txt file, it’s important to be sure that it will work correctly before uploading it. A simple robots.txt file probably won’t need testing, but if you are using wildcard rules to disallow groups of pages from the search engines, it’s best to test that the rules work as intended before committing the file to the site.

The best way to check your robots.txt file is to use Google Webmaster Tools. To do this, log in to Google Webmaster Tools, select Site Configuration from the left-hand menu and click the Crawler Access option on the sub-menu that appears. This will bring up a page which shows you the robots file for your site. About halfway down this page there is a text box which shows the current text of your robots.txt file.

Edit the contents of this text box so that it reflects the robots.txt text you want to test, including any new lines you have added. Scroll down below the box and you will see another text box called “URLs specify the URLs and user agents to test against”. In this box, type the URLs of the pages you want to check, to see whether Google can crawl them or not.

For example, suppose you have added the following rules to your robots file:

Disallow: /admin/
Disallow: /*?cm=*

Then add the following URLs to the URLs box, below your domain name:

www.yoursite.co.uk/admin/
www.yoursite.co.uk/index.html?cm=1234

Click the Test button and you will receive results for the URLs you have added to the URLs box.

The homepage of your site should be Allowed but the other two URLs should show that they are Disallowed.
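The same expectation can be roughly sanity-checked offline. The sketch below is not Google's tester, just a small approximation that assumes Googlebot-style wildcards: "*" matches any run of characters and a trailing "$" anchors the end of the URL. It handles Disallow rules only and ignores Allow rules and user-agent groups. The function names rule_to_regex and is_disallowed are made up for this example.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_disallowed(path, disallow_rules):
    """Return True if any non-empty Disallow rule matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules if rule)

# The two example rules from this post.
RULES = ["/admin/", "/*?cm=*"]

# The homepage plus the two test URLs (path and query string only).
for path in ["/", "/admin/", "/index.html?cm=1234"]:
    verdict = "Disallowed" if is_disallowed(path, RULES) else "Allowed"
    print(f"{path}: {verdict}")
```

With these rules the script reports the homepage as Allowed and the other two paths as Disallowed. Treat it as a quick pre-check only; the test in Google Webmaster Tools remains the authority on how Google reads the file.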

If all is OK with the results, you can upload your robots.txt file; if not, you can tweak the contents of the file and test again.

Doing this ensures that your robots file will work as intended when Google reads it and that you aren’t damaging your rankings by blocking Google from the wrong pages.

This entry was posted in Search Marketing Blog by Emily Mace.

About Emily Mace

Emily joined Vertical Leap as an SEO Campaign Delivery Manager in 2008, having gained wide search marketing experience as a web developer, SEO specialist and trainer for local government departments and Tourism South East. Emily gained the Google Analytics Individual Qualification in 2011, and regularly blogs on the technical aspects of SEO, sharing her expertise with our readers.