SSD Vendors: Please let developers obtain extended health and # of erase cycle stats on your SSDs.
Here’s the problem I currently have.
We’re looking at deploying the Intel X25-M MLC SSD in production.
The problem is that this drive tolerates fewer erase cycles than the Intel X25-E SLC drive, though it is much cheaper.
However, in our situation we’re write once, read many. I’m 99% certain that we will not burn out these drives. We write data to disk once and it is never written again.
The problem is that I can’t be 100% sure that this is the case. There are btree flushing and binary log issues that I’m worried about…
What would be really nice is an API (SMART?) that lets me enumerate the erase blocks on the drive, determine the max erase cycles, and read the current number of erase cycles.
This way, I can put an SSD into production, then determine the ETA to failure.
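To make the ETA idea concrete, here is a minimal sketch of the arithmetic, assuming the drive could report a current erase count and a rated maximum. No such API exists in this form as far as I know; `WearSample`, `eta_to_failure_seconds`, and the numbers in the usage lines are all made up for illustration, and the linear extrapolation is the simplest possible model, not a claim about how flash actually wears.

```python
from dataclasses import dataclass


@dataclass
class WearSample:
    """One hypothetical reading from the drive (no real API is implied)."""
    timestamp: float    # seconds since some fixed reference point
    erase_cycles: int   # highest erase count reported across erase blocks


def eta_to_failure_seconds(first: WearSample, last: WearSample,
                           max_erase_cycles: int) -> float:
    """Linearly extrapolate the time left until the rated erase limit.

    Returns float('inf') when no wear was observed between the samples.
    """
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")

    consumed = last.erase_cycles - first.erase_cycles
    if consumed <= 0:
        return float("inf")

    rate = consumed / elapsed  # erase cycles per second
    remaining = max(max_erase_cycles - last.erase_cycles, 0)
    return remaining / rate


# Usage with invented numbers: 10 cycles consumed in one day,
# against an assumed rating of 10,000 cycles.
day = 86400.0
eta = eta_to_failure_seconds(WearSample(0.0, 100),
                             WearSample(day, 110),
                             max_erase_cycles=10_000)
print(f"estimated days to rated limit: {eta / day:.0f}")
```

Two samples are enough for a first estimate, but in practice I would want many samples over time, since a workload with occasional bursts would make any single pair misleading.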
I can also add this to Nagios and Ganglia, trend the failure date, and alert if the erase rate (the derivative of the erase count) is high enough that the drive will soon fail.
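The alerting side could look roughly like the sketch below. The exit codes 0, 1, and 2 follow the usual Nagios plugin convention for OK, WARNING, and CRITICAL; everything else is an assumption on my part, including the function name `check_wear_rate`, the 365 and 90 day thresholds, and the idea that an erase rate in cycles per day is available at all.

```python
# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL = 0, 1, 2


def check_wear_rate(cycles_per_day: float, remaining_cycles: int,
                    warn_days: float = 365.0, crit_days: float = 90.0):
    """Map an erase rate onto a Nagios-style (status, message) pair.

    cycles_per_day and remaining_cycles are hypothetical inputs; the
    thresholds are illustrative defaults, not vendor guidance.
    """
    if cycles_per_day <= 0:
        return OK, "SSD OK - no measurable wear"

    days_left = remaining_cycles / cycles_per_day
    if days_left < crit_days:
        return CRITICAL, f"SSD CRITICAL - about {days_left:.0f} days to rated limit"
    if days_left < warn_days:
        return WARNING, f"SSD WARNING - about {days_left:.0f} days to rated limit"
    return OK, f"SSD OK - about {days_left:.0f} days to rated limit"


if __name__ == "__main__":
    # Invented reading: 50 cycles/day with 9,890 cycles remaining.
    status, message = check_wear_rate(50.0, 9890)
    print(message)
    # A real plugin would finish with sys.exit(status).
```

Running the same check a day after deploying a new database would also catch a design or configuration that wears the drive far faster than expected.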
Further, I can figure out if a database design is flawed. If I deploy a new database into production and the projected failure date is too soon after 24 hours, I know that something is wrong: either a misconfiguration or a problem with the design.
I think this would solve a LOT of the problems with deploying SSDs in enterprise environments (MySQL, Oracle, etc.).