Scaling IO During Software Releases

One of the problems I’m running into right now is that we have too many machines to distribute new software releases quickly. Our application is about 200 MB (with all configuration files and binaries), and pulling the files locally and performing a diff takes a while.

We use one central Subversion repository to check out builds across all our boxes, and it acts as a bottleneck. The machine can only serve data fast enough to allow 2-3 concurrent pushes.

I could build a faster IO subsystem by buying bigger and faster disks but this just seems like another project in and of itself.

It dawned on me that I could use the boxes THEMSELVES to push binaries over unicast and fan the release out exponentially to yield a significant performance boost (multicast isn’t really an option in our situation).

Instead of machine 1 pushing to machines 2, 3 … N one after another, every machine that already has the release could push to another machine in parallel.

1 pushes to 2 … then 1-2 push to 3-4 … then 1-4 push to 5-8 … then 1-8 push to 9-16 … etc.

This allows 1, 2, 4, 8, 16, 32 … machines to be updated in parallel in successive rounds, so N machines need only about log2(N) rounds instead of N - 1 sequential pushes.
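To make the doubling concrete, here is a minimal sketch of how the push schedule could be computed. The function name `doubling_schedule` and the 1..N numbering are just for illustration, and it assumes every machine can reach every other machine and pushes to one target per round.

```python
def doubling_schedule(n):
    """Return a list of rounds; each round is a list of (source, target) pairs.

    Machines are numbered 1..n and machine 1 starts out with the release.
    """
    rounds = []
    have = 1  # machines 1..have already hold the release
    while have < n:
        # Every machine that has the release pushes to one that does not.
        pairs = [(src, src + have)
                 for src in range(1, have + 1)
                 if src + have <= n]
        rounds.append(pairs)
        have = min(n, have * 2)
    return rounds


if __name__ == "__main__":
    for i, pairs in enumerate(doubling_schedule(10), start=1):
        print("round", i, pairs)
```

Each round only lists who pushes to whom; the actual copy (rsync, scp, whatever you already use) would be run for every pair in a round at the same time.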

… and no dedicated disk subsystem is required. I’d need to compute a simple hash of each distribution to make sure nothing was corrupted, but that shouldn’t be too hard.
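The corruption check could look something like the sketch below. It assumes the release is shipped as a single archive file and that SHA-256 is an acceptable hash; the function names `file_sha256` and `verify` are made up for this example.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so a ~200 MB archive is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file at path matches the hash published by the source."""
    return file_sha256(path) == expected_hex
```

The idea is that machine 1 computes the hash once, and every receiver calls `verify` before it starts pushing to anyone else, so a corrupted copy does not get multiplied in later rounds.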

  1. For file distro, you could use something like PCP:

    But, tbh, for $work we just have it set up to have multiple versions installed, so you can ‘pre-install’ across the entire cluster, and then when everything is ready, it’s as simple as swapping symlinks and restarting any services, rather than installing/copying being part of that cycle. This lets you do the file copy before dinner, and the main upgrade after dinner :P

  2. Yeah… that’s one way to do it… I think that’s the way I do it now but it’s just a bit annoying to have to wait ;)

    Plus, if you have too many machines, it might end up being next Wednesday’s dinner ;)
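For reference, the pre-install and symlink swap described in the first comment could be sketched roughly like this. It assumes a POSIX filesystem and a layout where each release lives in its own directory and a `current` symlink points at the live one; the name `activate_release` and the paths in the usage note are hypothetical.

```python
import os


def activate_release(release_dir, current_link):
    """Point current_link at release_dir without a window where it is missing."""
    tmp_link = current_link + ".tmp"
    # Clean up a leftover temporary link from an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over the old symlink is atomic on POSIX filesystems.
    os.replace(tmp_link, current_link)
```

Something like `activate_release("/opt/app/releases/v2", "/opt/app/current")` would then be followed by the service restart, which this sketch does not cover.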
