Thoughts on Munin, Performance Monitoring, and SVG

I spent some time the other night hacking on Munin to get it to produce SVG output and wanted to serialize my thoughts.

The basic premise I’m working from is that SVGs should be MUCH faster to generate on the server, send over the wire, and manipulate on the client (zoom, interact with, etc.).

The results are somewhat disappointing.

First… what’s good.

* It’s somewhat easy to have Munin generate SVG output. You just have to hack a
couple of scripts.

* The SVG output is certainly smaller. It’s about half the size of the PNG.

Now the bad.

* The SVG rendering on the server is NOT any faster (at least in my unscientific
benchmarks). The slowness might be in rrdtool or in Munin itself.

* Munin will need to be reworked internally to use OBJECT tags instead of IMG
tags, since SVG in an IMG tag doesn’t seem to be supported under Firefox or
Safari.

* SVGs were overall MUCH SLOWER, since for *some* reason Munin didn’t build them
lazily and instead created a new graph on each iteration.

* File size wasn’t my initial concern as much as time. The time to render each
graph is 2-4 seconds, which is unacceptable.

* The default font size in 2.15 is way too small.
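
One way to tell whether the slowness is in rrdtool or in Munin is to time rrdtool by itself for both formats. Here is a rough sketch, assuming `rrdtool` is on the PATH and that you point it at one of your own RRD files; the data source name, the one-day window, and the output file names are placeholders I made up, not anything Munin uses. It relies on `rrdtool graph` accepting `--imgformat PNG` or `--imgformat SVG`.

```python
import os
import subprocess
import tempfile
import time


def graph_command(rrd_path, ds_name, out_path, img_format):
    """Build an `rrdtool graph` command line for one data source."""
    return [
        "rrdtool", "graph", out_path,
        "--imgformat", img_format,              # "PNG" or "SVG"
        "--start", "-1d",                       # assumed window: last 24 hours
        f"DEF:v={rrd_path}:{ds_name}:AVERAGE",  # read the data source
        "LINE1:v#0000ff:value",                 # draw it as a single line
    ]


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_formats(rrd_path, ds_name, repeats=5):
    """Time rrdtool alone (no Munin involved) for PNG and SVG output."""
    results = {}
    for fmt in ("PNG", "SVG"):
        out_path = os.path.join(tempfile.gettempdir(), f"bench.{fmt.lower()}")
        cmd = graph_command(rrd_path, ds_name, out_path, fmt)
        results[fmt] = time_call(
            lambda: subprocess.run(cmd, check=True, capture_output=True),
            repeats,
        )
    return results
```

Calling `compare_formats("/path/to/some.rrd", "42")` (both arguments are hypothetical) returns a dict of best-case timings per format. If SVG and PNG come out about the same here, the extra cost is more likely on the Munin side.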

Perhaps SVG is the wrong choice. Maybe Apple’s canvas would be a better fit.

I was also thinking that sending an SVG generated on the server to the browser might be less than ideal.

Why not send over JSON output? The client would then load a Google Maps-style UI from this data and render the graph.
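
To make the JSON idea concrete, here is a minimal sketch of what such a payload could look like and how a zoom becomes a simple filter over the data. The field names (`name`, `start`, `step`, `values`) are my own invention, not a Munin or rrdtool format, and in a real browser client the `zoom` part would be JavaScript; Python is used here only to keep the sketch short.

```python
import json


def to_payload(name, start, step, values):
    """Serialize one evenly spaced series; None marks a missing sample."""
    return json.dumps(
        {"name": name, "start": start, "step": step, "values": values}
    )


def zoom(payload, t_from, t_to):
    """Return (timestamp, value) pairs that fall inside [t_from, t_to]."""
    data = json.loads(payload)
    points = []
    for i, value in enumerate(data["values"]):
        ts = data["start"] + i * data["step"]
        # Skip samples outside the window and samples with no data.
        if t_from <= ts <= t_to and value is not None:
            points.append((ts, value))
    return points


payload = to_payload("load", 1000, 300, [0.1, 0.2, None, 0.4])
print(zoom(payload, 1300, 1900))  # [(1300, 0.2), (1900, 0.4)]
```

The point is that zooming needs no round trip to the server and no re-render there; the client just redraws a subset of points it already has.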

Since it’s DATA, more manipulations are possible on the client, such as

  1. quellish

    Have you taken a look at Ganglia? It’s similar to Munin in some ways, but collects all of its data via multicast, which makes it much easier to manage over a large cluster.

  2. Yeah… Ganglia is worse in a lot of ways.

    I should really write down a list of what’s wrong.

    One is security. If you’re on a shared network you’ll broadcast your stats which isn’t fun.

  3. allspaw

    Please do give up the list of what you think is wrong with ganglia, I’m interested.

    Every tool of course has its shortcomings, but we find it to have a good amount of stuff out of the box when it comes to cluster visibility. Also, multicast isn’t the only option with Ganglia; unicast works just as easily, at which point, if you’re concerned about security, you can use whatever you normally would (ipfw/iptables/tcp wrappers/etc.) to get some control of who can get what.
