SEARCH MARKETING BLOG

Yahoo! Uses Distributed Computing to Speed up Search Index Processing

After wading through the official blog post on this subject (http://www.ysearchblog.com/archives/000521.html), which mainly deals with the fact that it is the world’s largest Hadoop installation, I’ve picked out a few facts that are interesting from the SEO point of view.

  • The Webmap (the database that feeds their algorithm) is now generated in a third less time than before
  • It keeps track of roughly 1 trillion links
  • It uses 10,000 Linux cores (which doesn’t mean 10,000 computers or even 10,000 processors, as processors are multi-core, but I guess it makes a nice round number)
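
To put the core count in perspective, here is a rough back-of-the-envelope sketch in Python. The cores-per-processor and processors-per-machine figures are purely my own hypothetical assumptions for illustration; Yahoo! hasn't said what hardware it actually uses.

```python
def machines_needed(total_cores, cores_per_processor, processors_per_machine):
    """Estimate how many machines are needed to supply a given number of cores."""
    cores_per_machine = cores_per_processor * processors_per_machine
    # Ceiling division: a partly used machine still counts as a whole machine.
    return -(-total_cores // cores_per_machine)


TOTAL_CORES = 10_000  # the figure quoted by Yahoo!

# Hypothetical hardware configurations (assumptions, not published specs):
# (cores per processor, processors per machine)
for cores_per_cpu, cpus_per_machine in [(2, 2), (4, 2)]:
    count = machines_needed(TOTAL_CORES, cores_per_cpu, cpus_per_machine)
    print(f"{cores_per_cpu}-core CPUs, {cpus_per_machine} per machine: {count} machines")
```

So depending on the hardware, 10,000 cores could easily mean only a couple of thousand physical boxes, which is still a lot but sounds rather less dramatic.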

So hopefully that’s some information that might actually be of use when talking to someone non-technical, or when you need to discuss the subject with a client.