Thoughts on InnoDB Page Compression

I spent the last couple of days playing with InnoDB page compression on the latest Percona build.

I’m pretty happy so far with Percona and the latest InnoDB changes.

Compression wasn’t living up to my expectations though.

I think the biggest problem is that compression can only use one core during replication and ALTER TABLE statements.

We have an 80GB database that was running on boxes with 96GB of RAM.

I wanted to try to run this on much smaller instances (32GB-48GB boxes) by compressing the database.
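
For reference, compressing a table with the InnoDB plugin boils down to switching to the Barracuda file format, enabling file-per-table, and rebuilding the table with a compressed row format. A minimal sketch follows; the table name and KEY_BLOCK_SIZE are placeholders, not our actual schema:

    -- Prerequisites for compressed tables in the InnoDB plugin (Barracuda).
    SET GLOBAL innodb_file_format = 'Barracuda';
    SET GLOBAL innodb_file_per_table = 1;

    -- Rebuild a table with compressed 8KB pages (down from the 16KB default).
    -- Smaller block sizes compress more but fail and recompress more often.
    ALTER TABLE my_big_table
      ROW_FORMAT=COMPRESSED
      KEY_BLOCK_SIZE=8;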

Unfortunately, after 24 hours of running ALTER TABLE statements, each of which would only use one core per table, the SQL replication thread went to 100% CPU and started falling behind fast.

I think what might be happening is that the InnoDB buffer pool fills up because pages can’t be written to disk fast enough, which forces the insert thread to compress pages in the foreground.
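
One way to sanity-check that hypothesis, assuming the InnoDB plugin’s compression tables in INFORMATION_SCHEMA are installed in this Percona build, is to watch the dirty-page count and the compressed-page memory while the ALTER / replication load is running:

    -- How many buffer pool pages are dirty (waiting to be flushed)?
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';

    -- How much buffer pool memory holds compressed pages, per block size?
    SELECT page_size, pages_used, pages_free, relocation_ops
      FROM information_schema.INNODB_CMPMEM;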

Having InnoDB use only one core / thread to compress pages seems like a very bad idea, especially on 8-16 core boxes; I’m testing on an 8-core box now, but we have 16-core boxes in production.

The InnoDB page compression documentation doesn’t seem to offer any hints about when pages are compressed or in which thread, nor do there seem to be any configuration variables we can change in this regard.
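
The closest thing to visibility here seems to be INFORMATION_SCHEMA.INNODB_CMP (again, assuming it’s available in this build), which at least counts compression and decompression operations and the time spent on them, even if it doesn’t say which thread did the work:

    -- Compression counters and cumulative time (seconds) per KEY_BLOCK_SIZE.
    -- compress_ops_ok < compress_ops means pages didn't fit and had to be
    -- split and recompressed, which burns even more CPU.
    SELECT page_size, compress_ops, compress_ops_ok, compress_time,
           uncompress_ops, uncompress_time
      FROM information_schema.INNODB_CMP;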

Perhaps a ‘compressed buffer pool only’ option could be interesting.

This way InnoDB would not have to maintain an LRU for compressed/decompressed pages. Further, it could read pages off disk, decompress them, and leave the decompressed pages in a small buffer. A worker thread (running on another core) could then compress the pages and move them back into the buffer pool, where they can be stored and eventually written back to disk.

This process could still become disk bottlenecked but at least it would use multiple cores.


  1. Heikki Tuuri

    Kevin, InnoDB page compression/decompression is multi-threaded. Marko Mäkelä, the developer of the compression code, says:

    “the compression and decompression always occur in the session thread. If several connections are accessing the same table, each of them will compress or decompress pages as needed. Decompression is avoided by caching the uncompressed page in the buffer pool, and compression is avoided by the per-page modification log. (If you make key_block_size very small, then there will not be enough room for the modification log, and every UPDATE or INSERT will result in full page compression. DELETE is always executed in-place, without compressing.)”

    Best regards,

    Heikki

  2. Hey Heikki.

    I completely missed that. I re-read the documentation specifically to verify that I didn’t miss anything yet I still didn’t see this comment :-/

    Anyway. I think the issue here is that if the replication thread can’t keep up, then you’re basically doomed when using compression.

    This is an 8-core box… I should be able to use all the cores to compress pages for the SQL thread.

    Of course, if all of those go to 100% then I’m in more trouble :)

    Also, you might want to look at other compression algorithms.

    “On Compressing the Textual Web” from WSDM 2010 was a really good paper on the subject.

    LZOP is probably the best bet. Apparently, more details on Zippy from Google have also been published.

    These can get about 80% of the compression ratio of gzip/compress but with MUCH faster compression speeds.

    Kevin





