Buffered Binary Logs…

One of the things that has always bothered me about replication is that the binary logs are written to disk and then read from disk.

There are two threads which are, for the most part, unaware of each other.

One thread (the IO thread) reads the remote binary logs and writes them to disk as relay logs; the other (the SQL thread) reads them back off disk and executes them.

While the Linux page cache CAN buffer these logs, the initial write still generates additional disk load.

One strategy, which could seriously boost performance in some situations, would be to pre-read, say, 10-50MB of data and just keep it in memory.

If a slave is catching up, it could have GIGABYTES of binary log data pending from the master. The IO thread writes all of this to disk, and by the time the SQL thread gets around to it, it has long since been evicted from the page cache. Those reads would then NOT come from cache; they hit the disk.

Simply using a small buffer could solve this problem.
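
To make the idea concrete, here's a minimal sketch of that buffer in Python (MySQL itself is C/C++, and every name here is invented for illustration): a bounded queue sits between the two threads, so the reader blocks when the buffer fills up and the applier consumes events straight from memory instead of off disk.

```python
import queue
import threading

# Pre-read up to ~50MB, as suggested above. queue.Queue counts items,
# not bytes, so assume ~1KB per event for this sketch.
BUFFER_BYTES = 50 * 1024 * 1024
buf = queue.Queue(maxsize=BUFFER_BYTES // 1024)

def io_thread(source):
    """Read events from the master and park them in memory.
    put() blocks once the buffer is full -- natural backpressure."""
    for event in source:   # stand-in for reading the remote binary log
        buf.put(event)
    buf.put(None)          # sentinel: no more events

def sql_thread(apply_event):
    """Apply events straight from memory instead of re-reading
    them off disk -- which is the whole point of the buffer."""
    while (event := buf.get()) is not None:
        apply_event(event)

events = (f"event {i}".encode() for i in range(10))
reader = threading.Thread(target=io_thread, args=(events,))
applier = threading.Thread(target=sql_thread, args=(print,))
reader.start(); applier.start()
reader.join(); applier.join()
```

The nice property is that no tuning is needed beyond the buffer size: a full buffer throttles the reader, an empty one parks the applier.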

One HACK would be to use a RAM drive or tmpfs for the logs. I assume that the IO thread will block if the disk fills up… if it does so intelligently, one could just create a 50MB tmpfs to store the relay logs. MySQL would then read these off tmpfs and execute them.

50MB-250MB should be fine for a pre-read buffer. Once the SQL thread finishes a relay log file and it is purged, space frees up and the IO thread would continue reading data from the master.
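
Concretely, the hack might look something like this. (Heavily hedged: on a replica these are the relay logs, whose location is set with the real relay-log option; the mount point and size below are purely illustrative.)

```
# Create a 50MB tmpfs for the relay logs (path and size are illustrative):
mount -t tmpfs -o size=50m tmpfs /var/lib/mysql-relay

# my.cnf -- point the relay logs at the tmpfs mount:
[mysqld]
relay-log = /var/lib/mysql-relay/relay-bin
```

The obvious caveat is that tmpfs contents vanish on reboot, so anything fetched but not yet executed would have to be re-fetched from the master.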


  1. Harrison Fisk

    MySQL actually has something similar to this called the IO_CACHE (which is used for the binary logs and relay logs). It buffers and caches reads and writes to files. I think it uses like 64k by default, so it’s pretty small.

    Since it is already used for the binary and relay logs, I imagine it could be relatively easy to extend it to allow a specified size and more control over it.

    Of course, if the logs were moved to a storage engine, then it could make use of the storage engine buffer instead.





