Great read on Hadoop’s 4,000-machine scalability wall.

Hadoop has hit a 4,000-node scalability wall.

Most of us aren’t running into this wall, but it’s interesting to see why it’s happening.

The elasticity of clouds makes this more of a challenge. Most people don’t normally need 4,000 nodes, but say you want to do something REALLY fast. I could see wanting to spin up 8,000-10,000 nodes to compute PageRank or a clustering algorithm and wanting it to finish FAST.

Having a limit of 4,000 nodes is a challenge in that scenario.
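To make the PageRank example concrete, here is a minimal sketch of what a single PageRank iteration looks like as a Hadoop MapReduce job. The class names, the page<TAB>rank<TAB>outlinks input layout, and the simplified (un-normalized) damping formula are all illustrative assumptions on my part, not anything taken from the Yahoo! post.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// One iteration of PageRank. Input lines are assumed to look like:
//   pageId <TAB> rank <TAB> comma-separated outlinks
public class PageRankIteration {

  public static class RankMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t", -1);
      String page = parts[0];
      double rank = Double.parseDouble(parts[1]);
      String outlinks = parts.length > 2 ? parts[2] : "";

      // Re-emit the link structure (marked with '#') so the reducer can keep the graph.
      context.write(new Text(page), new Text("#" + outlinks));

      // Spread this page's current rank evenly across its outlinks.
      if (!outlinks.isEmpty()) {
        String[] targets = outlinks.split(",");
        for (String target : targets) {
          context.write(new Text(target), new Text(Double.toString(rank / targets.length)));
        }
      }
    }
  }

  public static class RankReducer extends Reducer<Text, Text, Text, Text> {
    private static final double DAMPING = 0.85;

    @Override
    protected void reduce(Text page, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      double sum = 0.0;
      String outlinks = "";
      for (Text value : values) {
        String v = value.toString();
        if (v.startsWith("#")) {
          outlinks = v.substring(1);     // recovered link structure
        } else {
          sum += Double.parseDouble(v);  // rank contribution from an in-link
        }
      }
      // Simplified, un-normalized damping; a real job would also handle dangling nodes.
      double newRank = (1.0 - DAMPING) + DAMPING * sum;
      context.write(page, new Text(newRank + "\t" + outlinks));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "pagerank-iteration");
    job.setJarByClass(PageRankIteration.class);
    job.setMapperClass(RankMapper.class);
    job.setReducerClass(RankReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

You’d chain a handful of these jobs until the ranks converge, and each iteration is embarrassingly parallel across pages, which is exactly the kind of workload where grabbing 8,000-10,000 nodes for a short burst would pay off.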

 

Given observed trends in cluster sizes and workloads, the MapReduce JobTracker needs a drastic overhaul to address several deficiencies in its scalability, memory consumption, threading-model, reliability and performance. Over the last 5 years, we’ve done spot fixes, however lately these have come at an ever-growing cost as evinced by the increasing difficulty of making changes to the framework. The architectural deficiencies, and corrective measures, are both old and well understood – even as far back as late 2007.

[From The Next Generation of Apache Hadoop MapReduce · Yahoo! Hadoop Blog]

 


  1. Clusters of more than 4,000 nodes are commonly used in supercomputing… and they’re batch systems. I always find it a bit disappointing when one part of computing ignores relevant experience from another part of the field.





