Is MySQL Binary Data Replication Broken?

What’s the deal with replicating binary data in MySQL? Is anyone out there doing this successfully?

We ran into an evil problem last night with 4.1.22. Apparently, if you replicate binary data (in this case a gzip stream) it will break replication. The data inserts correctly on the master but fails to insert correctly on the slave.

Here’s the real kicker. It fails to replicate SILENTLY. Replication falls behind but Seconds_Behind_Master still reports zero seconds [1].

We can of course base64-encode the data, but this results in roughly 33% data bloat. Not exactly fun. We’re going to do this in the short term, but this bug really needs to be fixed.
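For anyone wondering where the 33% figure comes from: base64 emits 4 output bytes for every 3 input bytes. Here is a minimal Python sketch that measures it on a gzip stream. The `base64_overhead` helper and the sample payload are made up for illustration and are not part of our actual setup.

```python
import base64
import gzip


def base64_overhead(raw: bytes) -> float:
    """Return the fractional size increase from base64-encoding raw."""
    if not raw:
        raise ValueError("raw must not be empty")
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1.0


# Hypothetical payload: a gzip stream like the one that broke replication.
payload = gzip.compress(b"example row payload " * 500)
encoded = base64.b64encode(payload)

# The encoded form is plain ASCII and decodes back to the original bytes.
assert base64.b64decode(encoded) == payload

print(f"{len(payload)} raw bytes, {len(encoded)} encoded bytes, "
      f"overhead {base64_overhead(payload):.1%}")
```

The overhead lands at about one third for any payload of meaningful size, because padding adds at most a couple of bytes.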

There appear to be a few somewhat related bugs here, including this, this, this, this, this and this.

I was just talking to Jeremy Cole online and he’s reported something similar. In fact, others have argued for binlog checksumming to help identify these problems (silent data corruption is evil).

It really bothers me that it’s almost 2008 and MySQL has problems where it can’t do something as simple as store binary data.

1. This isn’t the only insanity I’ve found with Seconds_Behind_Master. If the master has an INSERT larger than max_allowed_packet, the slave will get stuck in a retry loop, writing errors to the log and reporting Seconds_Behind_Master of either NULL or zero the whole time.

  1. Seconds_Behind_Master of NULL doesn’t really bother me, because then at least you know there is a problem. Seconds_Behind_Master of 0 when replication is not working is lame. We have had this issue in the past, although I am not sure of the root cause since it was transient. I have thought about monitoring Exec_Master_Log_Pos and making sure it’s incrementing by more than X positions per Y seconds, and if not, alerting or taking the DB out of rotation for production reads/writes. Super-hack, but what can you do when MySQL doesn’t report the correct data?

  2. Yeah… I’ve been thinking about that today. One way to potentially handle this is to look at the rate of binary log processing and assume that, on average, X bytes take Y minutes to process.

    Maybe some sort of hack on the client side might also be interesting.

  3. xaprb

    Jeremy Cole contributed a heartbeat script to MySQL Toolkit that Does Not Lie(tm). I think this should solve the problem.
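The two monitoring ideas from the comments (watch whether Exec_Master_Log_Pos advances, and estimate lag from the binlog processing rate) could be sketched roughly as below. This is only the decision logic, in Python, with hypothetical names (`is_stalled`, `estimate_lag_seconds`, `min_advance`, `window_seconds`). It assumes you collect the samples yourself by polling SHOW SLAVE STATUS on the slave, and it is not the heartbeat script mentioned above.

```python
from typing import Sequence, Tuple

# One sample: (unix time in seconds, Exec_Master_Log_Pos at that time).
Sample = Tuple[float, int]


def is_stalled(samples: Sequence[Sample], min_advance: int,
               window_seconds: float) -> bool:
    """Return True if the position advanced by less than min_advance
    between the first and last sample, and those samples span at least
    window_seconds."""
    if len(samples) < 2:
        return False
    first_time, first_pos = samples[0]
    last_time, last_pos = samples[-1]
    if last_time - first_time < window_seconds:
        return False  # not enough history to judge yet
    if last_pos < first_pos:
        # The position is an offset within one master binlog file, so a
        # decrease means the slave moved on to a new file: that is progress.
        return False
    return last_pos - first_pos < min_advance


def estimate_lag_seconds(bytes_behind: int, bytes_per_second: float) -> float:
    """Estimate lag from the unprocessed binlog bytes and an assumed
    average processing rate (the 'X bytes take Y minutes' idea)."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return bytes_behind / bytes_per_second


# Position stuck at 4242 for two minutes, with a 60 second window.
print(is_stalled([(0.0, 4242), (60.0, 4242), (120.0, 4242)], 1, 60.0))
```

One caveat: an idle master also leaves the position unchanged, so a check like this gives false alarms unless you also compare against the master's own position or keep a steady trickle of writes flowing, which is essentially what a heartbeat does.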
