Storing the Full Internet

The other day I blogged about Blekko and what it would take, in terms of hardware, to index the full Internet.

High Scalability responded with some interesting thoughts.

Kevin Burton calculates that Blekko, one of the barbarian horde storming Google’s search fortress, would need to spend $5 million just to buy enough weapons, er, storage.

Kevin estimates storing a deep crawl of the internet would take about 5 petabytes. At a projected $1 million per petabyte that’s a paltry $5 million. Less than expected. Imagine in days of old an ambitious noble itching to raise an army to conquer a land and become its new prince. For a fine land, and the search market is one of the richest, that would be a smart investment for a VC to make.

The comments are interesting.

  1. onotech

    Now calculate how much the bandwidth would cost to shift 5 petabytes from point A to point B in 90 days.

    (5,000,000 gigabytes * 8 bits per byte * overhead factor 2) / 7,776,000 seconds

    = about 10 gigabits per second, continuous, to refresh the index 4 times a year.

    Hmmm. A quick n dirty search for “gigabit bandwidth cost” suggests this would be about $150K per month. Make that $400K per month if you wanted to refresh every 30 days. Not bad!

  2. Good point. I didn’t even compute the bandwidth costs.

    AKA this stuff is expensive :)
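
For anyone who wants to check the arithmetic, here is a rough Python sketch of the two estimates above. Every input is taken from the post and the comment (5 petabytes, $1 million per petabyte, an overhead factor of 2, a 90-day refresh); the function names are my own, and the decimal units (1 petabyte = 1,000,000 gigabytes) are an assumption that matches the comment's figures.

```python
# Back-of-the-envelope check of the figures quoted above.
# All inputs are estimates from the post and the comment, not measurements.

PETABYTES = 5                      # estimated size of a deep crawl
COST_PER_PETABYTE = 1_000_000      # projected storage cost, in USD
GIGABYTES = PETABYTES * 1_000_000  # decimal units, as used in the comment
OVERHEAD_FACTOR = 2                # the commenter's overhead allowance
SECONDS_PER_DAY = 86_400


def storage_cost_usd(petabytes, cost_per_petabyte):
    """Total storage cost in USD for the given number of petabytes."""
    return petabytes * cost_per_petabyte


def sustained_gbps(gigabytes, refresh_days, overhead=OVERHEAD_FACTOR):
    """Continuous gigabits per second needed to move the data in refresh_days."""
    gigabits = gigabytes * 8 * overhead
    return gigabits / (refresh_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"Storage: ${storage_cost_usd(PETABYTES, COST_PER_PETABYTE):,}")
    for days in (90, 30):
        rate = sustained_gbps(GIGABYTES, days)
        print(f"Refresh every {days} days: {rate:.1f} Gbps sustained")
```

With a 90-day refresh this comes out just over 10 gigabits per second, in line with the comment. Cutting the window to 30 days triples the required rate to roughly 31 gigabits per second.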

