MySQL Storage Engine Services

In a comment while we were discussing his Queue Engine, Brian Aker mentioned his concept of “engine services”:

“Like partitioning, this could be wrapped around any engine (I call these engine services).”

I’ve been thinking about something similar for a few weeks.

There are a number of features (partitioning is an example) that could be implemented by wrapping existing tables with new table definitions.

This might be closer to federated tables than to partitioning.

With federated tables you define a new table with a CONNECTION parameter which points to another table.

A storage engine service could be like a regular table but with the additional metadata necessary to initialize it.

Transparent encryption or compression [1] would be an example.

Right now we use BLOB compression to save disk space and improve IO performance. We have tons of free CPU since we use a distributed crawler, so we compress on the client and send the data to the MySQL daemon already compressed.
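To make the client-side step concrete, here is a minimal sketch of what a crawler node might do before sending a BLOB to MySQL. The function names are hypothetical, and the base64 step is an assumption based on the replication workaround mentioned in footnote 2; it is not our actual crawler code.

```python
import base64
import zlib


def encode_blob(raw: bytes) -> bytes:
    """Compress on the client, then base64-encode so the value is plain ASCII."""
    return base64.b64encode(zlib.compress(raw))


def decode_blob(stored: bytes) -> bytes:
    """Reverse encode_blob: base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(stored))


# Repetitive crawled content compresses well, even after the base64 overhead.
page = b"<html>" + b"crawled content " * 200 + b"</html>"
stored = encode_blob(page)
```

The MySQL server only ever sees `stored`, which is why it cannot interpret the value on its own.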

The problem is that a SELECT returns the data still compressed.

This makes it difficult to debug because MySQL doesn’t know about our compression scheme [2]. If I were to write a storage engine service that implemented transparent decompression, I could run SELECTs on a ‘viewport’ of this table, which would decompress the data before making it visible to MySQL.

I could then perform LIKE queries on these BLOBs to find the data I’m looking for.
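The wrapping idea can be sketched outside of MySQL. This is only an illustration of the concept, not a storage engine: `DecompressingView` and its `like` method are hypothetical names, the "base table" is a plain dict, and the stored format is assumed to be base64 over zlib.

```python
import base64
import zlib
from typing import Dict, Iterator, List, Tuple


class DecompressingView:
    """Hypothetical read-only wrapper over a table of compressed BLOBs."""

    def __init__(self, base_table: Dict[int, bytes]) -> None:
        # Maps row id -> base64(zlib(raw)); the wrapper never modifies it.
        self._base = base_table

    def rows(self) -> Iterator[Tuple[int, bytes]]:
        """Yield (row id, decompressed body) for every row in the base table."""
        for row_id, stored in self._base.items():
            yield row_id, zlib.decompress(base64.b64decode(stored))

    def like(self, needle: bytes) -> List[int]:
        """Rough stand-in for: SELECT id ... WHERE body LIKE '%needle%'."""
        return [row_id for row_id, body in self.rows() if needle in body]


base_table = {
    1: base64.b64encode(zlib.compress(b"<html>storage engine services</html>")),
    2: base64.b64encode(zlib.compress(b"<html>unrelated page</html>")),
}
view = DecompressingView(base_table)
matches = view.like(b"engine")
```

The base table stays compressed on disk; only the view pays the CPU cost of decompression, and only when someone actually reads through it.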

1. This might be a somewhat complicated example because other database systems like Bigtable are able to get exceptional compression ratios because they implement block-level compression.

2. Technically we could use MySQL’s UNCOMPRESS() function and store our data in zlib format. However, MySQL’s braindead zlib support mangles the external representation of the zlib data. I haven’t yet had a chance to grok the braindamaged function, so I couldn’t make our zlib output compatible. I’m also having to base64 encode the data due to a potential bug in MySQL replication.
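A likely source of the incompatibility: the MySQL manual describes COMPRESS() output for a nonempty string as a 4-byte length of the uncompressed data (low byte first) followed by the zlib stream, so plain zlib output is missing that prefix. The sketch below assumes that documented layout; the function names are hypothetical, and it ignores the extra "." the manual says is appended when the compressed value ends in a space.

```python
import struct
import zlib


def mysql_compress(raw: bytes) -> bytes:
    """Build a value in the layout documented for MySQL's COMPRESS()."""
    if not raw:
        return b""  # the manual says empty strings are stored as empty strings
    # 4-byte little-endian uncompressed length, then the ordinary zlib stream.
    return struct.pack("<I", len(raw)) + zlib.compress(raw)


def mysql_uncompress(stored: bytes) -> bytes:
    """Reverse mysql_compress and check the length prefix."""
    if not stored:
        return b""
    (expected_len,) = struct.unpack("<I", stored[:4])
    raw = zlib.decompress(stored[4:])
    if len(raw) != expected_len:
        raise ValueError("length prefix does not match decompressed size")
    return raw
```

If this layout holds for your server version, data written with `mysql_compress` on the client should be readable with UNCOMPRESS() in a SELECT, which I have not verified here.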
