Memcached or when 70% is Full

I’ve been looking at a problem with our memory cache system at work for the last week and the problem finally clicked into place.

At work I developed this really awesome benchmarking library which exports statistics from various subsystems in our robot. It works really well – too well sometimes.

It’s been telling me that one of my key caches is only running at 20% efficiency.

This is a memcached cache, which means it’s broken out into many small servers each running with 256M or 512M of memory (really, whatever they can spare).

Each of these servers was reporting that it was only 70% full, so I don’t need to add any more memory, right?

Wrong. I was looking at the number of stored bytes vs. the total maximum number of bytes allocated to memcached.

The problem is that the maximum number is theoretical (which I already sort of knew). The way memcached is set up can allow for memory fragmentation.

Which leads me to Brad’s post on the subject:

Also not ideal about slabs of size 2^n is memory efficiency. Empirical evidence (41 GB of LiveJournal memcached data) shows that 30-35% of the memory is wasted. (if you want to store a 129 byte object, it has to go in the 256 byte class.)

Brad says that in LiveJournal’s case 30% was wasted hence my 70% efficiency numbers.
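Brad’s 129-byte example can be sketched numerically. Here’s a toy model (hypothetical function name, uniform size distribution assumed) of power-of-two slab classes:

```python
def slab_class_pow2(size, min_class=64):
    """Smallest power-of-two slab class that fits `size` (toy model)."""
    c = min_class
    while c < size:
        c *= 2
    return c

# A 129-byte object lands in the 256-byte class, wasting ~50% of that chunk.
print(slab_class_pow2(129))  # 256

# Average rounding waste over uniformly distributed sizes comes out around
# 25%; per-item overhead pushes real deployments toward Brad's 30-35%.
sizes = range(65, 4097)
waste = sum(1 - s / slab_class_pow2(s) for s in sizes) / len(sizes)
print(f"average rounding waste: {waste:.0%}")
```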

I’d really love to see a CPU benchmark comparing a linked-list LRU implementation vs. the current slab implementation. My gut tells me that using a linked list would be much better than the current slab allocator, and the modest CPU hit would probably be worth it.

I wonder what the status of this is…..

Interestingly, it turns out that stealing memory from the MySQL process and handing it over to the memcached process on the same box should yield a significant performance boost.


  1. I know it’s a bit of overkill and not that easy to implement, but implementing a garbage collection mechanism for memcached seems like the best option.

    Something similar to what .NET has (although it could be a lot simpler).

    This would solve a few things, including fragmentation and collection of items that are rarely accessed – meaning you can store only the things that are REALLY in need of caching.

    I’m not monitoring the memcached mailing list, but I suppose this has probably come up at some point (and if not, it should).

  2. I’m not sure if you know, but telnet into your memcached instances and type ‘stats slabs’, and it will tell you all about the different slab allocations: which ones are actually being used, and which ones are being wasted.
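    To make that concrete, here’s a hedged sketch that parses a captured ‘stats slabs’ response. The sample values are made up, and field names like `used_chunks` follow reasonably modern memcached builds – older versions report a slightly different set:

    ```python
    # Hypothetical 'stats slabs' output captured via telnet; values invented.
    sample = """\
    STAT 3:chunk_size 152
    STAT 3:total_chunks 6898
    STAT 3:used_chunks 6200
    STAT 5:chunk_size 240
    STAT 5:total_chunks 4369
    STAT 5:used_chunks 120
    END"""

    slabs = {}
    for line in sample.splitlines():
        line = line.strip()
        if not line.startswith("STAT"):
            continue
        key, value = line.split()[1:3]
        slab_id, field = key.split(":")
        slabs.setdefault(slab_id, {})[field] = int(value)

    # Report per-class utilization so under-used classes stand out.
    for slab_id, s in sorted(slabs.items()):
        pct = 100 * s["used_chunks"] / s["total_chunks"]
        print(f"class {slab_id}: chunk {s['chunk_size']}B, {pct:.0f}% of chunks in use")
    ```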

  3. One more thing that could help is a growth factor of less than 2 for slab sizes. This is how the Hoard memory allocator is implemented – it uses 1.4 as the base, which can be adjusted. It also uses direct mmap allocation for objects over a certain size.

    In any case it would be interesting if memcached could optionally use either its own allocator or the operating system’s.

    Speaking about your experience of taking memory from MySQL for memcached improving performance: that is quite expected for many workloads. It may take several reads for MySQL to build a single object, which means more memory is needed to cache the same number of distinct objects.
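    The effect of the growth factor can be estimated with a quick sketch (hypothetical helper names, uniform object-size distribution assumed):

    ```python
    def classes(base, factor, limit):
        """Generate geometrically growing slab class sizes up to `limit`."""
        out, c = [], float(base)
        while c <= limit:
            out.append(int(c))
            c *= factor
        return out

    def avg_waste(factor, base=64, limit=1 << 16):
        """Mean fraction of a chunk lost to rounding, sizes uniform in (base, limit)."""
        cls = classes(base, factor, limit * 2)
        sizes = range(base + 1, limit)
        def fit(s):
            return next(c for c in cls if c >= s)
        return sum(1 - s / fit(s) for s in sizes) / len(sizes)

    # Smaller growth factors round object sizes up less aggressively.
    for f in (2.0, 1.4, 1.25):
        print(f"factor {f}: ~{avg_waste(f):.0%} lost to rounding")
    ```

    Under these assumptions a factor of 2 loses roughly a quarter of each chunk on average, while 1.4 and 1.25 cut that substantially – which is presumably the motivation for the sub-2 factors.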

  4. > One more thing that could help is a growth
    > factor of less than 2 for slab sizes. This
    > is how the Hoard memory allocator is
    > implemented – it uses 1.4 as the base,
    > which can be adjusted. It also uses direct
    > mmap allocation for objects over a certain size.

    Hm… I haven’t seen the Hoard memory allocator. I’m actually using memcached 1.1.13-pre, which is an official branch incorporating the Facebook patches, including a 1.25 growth-factor slab allocator.

    There’s a lot of good stuff here, including CPU usage patches and so forth. It seems to be working without any problems on 50% of my servers, so I might upgrade the rest.

    > In any case it would be interesting if
    > memcached could optionally use either its
    > own allocator or the operating system’s.

    It can use malloc() via a ./configure option, but it’s very slow…

    > Speaking about your experience of taking
    > memory from MySQL for memcached improving
    > performance: that is quite expected for many
    > workloads. It may take several reads for
    > MySQL to build a single object, which means
    > more memory is needed to cache the same
    > number of distinct objects.

    I guess I was expecting the Linux page cache to kick in and buffer this stuff, but it doesn’t seem to be living up to my expectations.

    I think we need a MyISAM row cache implementation that uses memcached. That would rule… There’s one for Postgres…

    Anyway… I’m probably going to blog about this later…

  5. I did not know it has an option to use malloc via configure. I’m curious, however, which malloc it uses. The non-threaded allocator in Linux was slow a few years ago, the threaded one being much better. I’m not sure, however, how that has changed with glibc development.

    Speaking about the Linux buffer cache, it probably did its job, but it just needs more space to buffer the stuff. Imagine for example you need to buffer 1000 random 100-byte rows from a 10GB table. How much memory do you think it will take? 100KB? Wrong. The page cache operates in pages, so unless you happen to have a few rows from the same page it will take 4MB – quite a difference. Not even to mention you might need several rows cached for each object.
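    The arithmetic checks out; a one-liner version, assuming 4 KB pages:

    ```python
    PAGE = 4096                 # typical Linux page size in bytes
    rows, row_size = 1000, 100

    useful = rows * row_size    # the data you actually wanted (~100 KB)
    # Worst case for random rows: each one sits on its own page, so the
    # page cache pins a full 4 KB per row.
    resident = rows * PAGE

    print(f"useful: {useful // 1024} KB, resident: {resident / 2**20:.1f} MB")
    ```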

    Speaking about memcached cache for PostgreSQL – do you have link for patches ? It would be interesting to do some benchmarks.

    I’m quite pessimistic about it, however, unless your queries are doing just a couple of row reads. The local network is fast, and you can probably get 10,000 rows retrieved per second, while it can be close to 1,000,000 when working with the local file cache (or buffer pool).

    This is, for example, why joins are so slow at the moment with MySQL Cluster.
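    The throughput gap here is mostly round-trip latency; a back-of-envelope sketch with assumed figures (illustrative, not measured):

    ```python
    # Per-row fetch cost is dominated by latency when reads are serialized.
    net_rtt = 100e-6   # assumed ~100 µs per memcached get over a LAN
    local   = 1e-6     # assumed ~1 µs per row from a warm local buffer pool

    for name, t in (("network", net_rtt), ("local", local)):
        print(f"{name}: ~{1 / t:,.0f} rows/s")
    ```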





