Using ON DUPLICATE KEY UPDATE to improve MySQL Replication Performance
Here’s a fun little performance tweak you can use to increase the throughput of your MySQL slave boxes.
MySQL slave replication is single threaded when it comes to running SQL. If you have a number of clients performing INSERTs at the same time on a master server these are serialized and replayed on your slaves. The master has an IO advantage in that the controller can batch operations (especially on SCSI and RAID controllers) and generally perform more intelligent IO which the slaves aren’t capable of handling. Apparently this is slated to be fixed in MySQL 5.1 but I’m not currently aware of the details.
Rewriting your UPDATE/INSERTS to use ON DUPLICATE KEY UPDATE can allow you take individual statements and rewrite them. For example you can do:
INSERT INTO FOO
ON DUPLICATE KEY UPDATE BAR=VALUES(BAR)
Instead of issuing two individual INSERT or UPDATE statements.
Here’s the full documentation on ON DUPLICATE KEY UPDATE
This then allows your slave servers to run this as one statement thereby improving performance. You also have the added value of reducing the query optimizer parsing of this statement so that this is only executed once instead of twice. Even when using prepared statements this can become a performance issue.
In our environment we were able to increase total overall throughput by 2-3x. This also prevented lbpool from taking slave servers out of production by 20% (they were no longer falling too far behind replication). Benchmarking bulk inserts individually also noted a 10-15x performance update in general for batched vs individual operations.