SEARCH MARKETING BLOG

SEO Speak: What is a Spider Trap?

We’ve spoken about spider traps on the blog before, so I thought I’d go into a bit more detail about what a spider trap is as part of my SEO Speak series.

When search engine crawlers discover pages on a website, they follow the links you provide on that site. If there is a glitch in your internal linking structure, the search engine bots can get caught in a loop.

A spider trap is a series of links or pages on a website which causes search engine spiders to get caught in a loop, creating an infinite number of URLs for pages which don’t exist on the site.  For example:

www.yourdomain.com/products/items/products/items/products/items/products/items/products/items/products/items/ (and so on)

Through the links on each page, the search engines keep finding more product and item pages, each one added to the end of the previous URL.  These pages don’t actually exist, and because the loop keeps creating new links it can cause problems with the indexing of your site.  This often happens when you have dynamic content on the site, such as a calendar, or a dynamically generated site structure.
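
To show how a loop like this can build up, here is a minimal Python sketch. It assumes the trap comes from relative links (no leading slash) on a site that answers at any URL depth. That cause is my assumption for the example, and the two hrefs are made up. It uses urljoin from the standard library to resolve each link the way a crawler would.

```python
from urllib.parse import urljoin

# Hypothetical starting page on the example domain.
START_PAGE = "http://www.yourdomain.com/products/"

# Hypothetical relative links: the products page links to "items/",
# and the items page links back to "products/" (no leading slash).
RELATIVE_HREFS = ["items/", "products/"]


def follow(start, hrefs, hops):
    """Resolve hrefs against the current URL in turn, as a crawler would."""
    urls = [start]
    for i in range(hops):
        next_href = hrefs[i % len(hrefs)]
        urls.append(urljoin(urls[-1], next_href))
    return urls


trail = follow(START_PAGE, RELATIVE_HREFS, 4)
for url in trail:
    print(url)
```

Every hop produces a longer URL that the crawler has never seen before, so without a depth limit it never runs out of "new" pages to request.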

Kerry mentioned the effects on your search engine crawling in her blog post about Google helping with the crawl efficiency of your website.  Your website could have 150 pages, but if you have a spider trap, Google and the other search engines could be seeing thousands of pages in their index.  Another issue a spider trap can cause is that the search engine crawlers might not find all 150 actual pages on the site before getting lost in the trap.  As a result, your other pages will take longer to be indexed while Google follows the spider trap.

So how do you stop Google from crawling this spider trap?

Well, the first thing to do is to identify what is causing the problem and on which pages it is occurring.  A good way to do this is to watch for issues when creating your sitemap.xml file.  On a number of occasions we have found spider traps in the content of a website while creating its sitemap.xml.
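
If you have the list of URLs collected while building the sitemap, one rough way to spot a trap is to look for path segments that repeat. The Python sketch below is only a heuristic under that assumption, not a feature of any sitemap tool. The function name looks_like_trap and the max_repeats threshold are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def looks_like_trap(url, max_repeats=2):
    """Flag a URL when any one path segment occurs more than max_repeats times."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if not segments:
        return False
    return max(Counter(segments).values()) > max_repeats


# Hypothetical crawl results on the example domain.
crawled = [
    "http://www.yourdomain.com/",
    "http://www.yourdomain.com/products/items",
    "http://www.yourdomain.com/products/items/products/items/products/items",
]

suspects = [url for url in crawled if looks_like_trap(url)]
```

With this sample list, only the third URL is flagged, because "products" and "items" each appear three times in its path. A genuine site can repeat a segment legitimately, so treat the results as pages to check by hand.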

Once you have identified the issues on the site, use your robots.txt file to exclude the pages which are creating the loop, so that search engines stop crawling them.  This will help all the real pages on your site to be found by Google.
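
As an illustration, here is a small Python sketch that checks a robots.txt rule before you publish it. The Disallow path is a hypothetical rule for the example URL used earlier, so your own rule will depend on where your loop starts. The check uses RobotFileParser from Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block the first looped path,
# which also blocks every deeper URL that starts with it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/items/products/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url):
    """Return True if a crawler obeying these rules may fetch the URL."""
    return parser.can_fetch("*", url)


real_page_ok = allowed("http://www.yourdomain.com/products/items/")
trap_page_ok = allowed("http://www.yourdomain.com/products/items/products/items/")
```

Here real_page_ok is True and trap_page_ok is False, so the real products and items pages stay crawlable while the looped URLs are excluded.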

This entry was posted in SEO Blog by Emily Mace.

About Emily Mace

Emily joined Vertical Leap as an SEO Campaign Delivery Manager in 2008, having gained wide search marketing experience as a web developer, SEO specialist and trainer for local government departments and Tourism South East. Emily gained the Google Analytics Individual Qualification in 2011, and regularly blogs on the technical aspects of SEO, sharing her expertise with our readers.