RAM vs SSD Based Databases

Fully RAM-based databases are being used in more and more places. For a lot of use cases, throwing ALL of your data into memory will give you a major performance benefit.

But when should you use RAM vs SSD?

RAM is about $100/GB. SSD is about $30/GB.

SSDs top out at about 100MB/s for reads. The only time you should throw money at an all-RAM setup is when you need more performance than a SATA disk can give you.

Of course, this is going to depend on your application.
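Here’s the back-of-the-envelope version of that tradeoff, as a minimal sketch using the rough prices and the ~100MB/s figure above (placeholders, not benchmarks):

```python
# Rough RAM-vs-SSD decision for a working set, using the ballpark
# figures from this post. All prices and throughputs are placeholders.
RAM_PER_GB = 100.0    # $/GB, approx.
SSD_PER_GB = 30.0     # $/GB, approx.
SSD_READ_MBS = 100.0  # sustained reads from one SATA SSD, MB/s, approx.

def storage_choice(db_gb, required_mbs):
    """Return (medium, cost) for a DB of db_gb GB needing required_mbs MB/s of reads."""
    if required_mbs > SSD_READ_MBS:
        # Past what a single SATA SSD can deliver, all-RAM starts to make sense.
        return "RAM", db_gb * RAM_PER_GB
    return "SSD", db_gb * SSD_PER_GB

print(storage_choice(db_gb=64, required_mbs=80))   # ('SSD', 1920.0)
print(storage_choice(db_gb=64, required_mbs=400))  # ('RAM', 6400.0)
```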

Update:

Another thought: 100MB/s is 800Mbit/s, which is close to 1Gbit/s. Gigabit Ethernet is pretty much the only practical networking option at the moment.

Even if you DID have a box that could saturate the gigabit link, you’re not going to get any better performance by putting it all in RAM.

Interesting.

One could use dual gigabit ports, but that’s only enough bandwidth for two SSDs.

You could use 10GbE but that’s going to cost you an arm and a leg!

We’re going to deploy with 3 SSDs per box. This gives us a cheaper footprint since we can run with a larger DB, but I’ll never come anywhere near the full throughput of the drives if I’m on GbE.
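Spelling out the math (same rough numbers as above):

```python
# Why GbE is the ceiling: three SSDs' worth of reads vs. one gigabit link.
SSD_READ_MBS = 100   # MB/s per drive, approx.
DRIVES_PER_BOX = 3
GIGE_MBITS = 1000    # raw gigabit line rate

array_mbits = SSD_READ_MBS * DRIVES_PER_BOX * 8  # 2400 Mbit/s of reads
print("array read rate: %d Mbit/s" % array_mbits)
print("GbE line rate:   %d Mbit/s" % GIGE_MBITS)
print("network bound:   %s" % (array_mbits > GIGE_MBITS))  # True
```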


  1. Man, I wish I knew more about this stuff. I suspect a RAM-based db structure would be awesome for another project I’m working on, but (a) I don’t quite know if it’s worth the trouble performance-wise and (b) I wouldn’t know how to do it.

  2. There is a paper that tells you how to calculate this. I blogged about it here: http://0x0000.org/2007/12/the_5_minute_rule.html
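    (For reference, a rough sketch of the Gray/Putzolu break-even calculation that paper is based on; the hardware numbers below are placeholders, not measurements.)

    ```python
    # Five-minute-rule break-even interval: keep a page in RAM if it is
    # re-read more often than this. All hardware numbers are placeholders.
    def break_even_seconds(pages_per_mb_ram, ios_per_sec_per_disk,
                           price_per_disk, price_per_mb_ram):
        return (pages_per_mb_ram / ios_per_sec_per_disk) * \
               (price_per_disk / price_per_mb_ram)

    # e.g. 16KB pages (64 per MB), an HDD doing ~200 IOPS at ~$80,
    # and RAM at ~$100/GB (~$0.10/MB):
    print(break_even_seconds(64, 200.0, 80.0, 0.10))  # ~256 seconds
    ```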

  3. Hey Jonathan,

    Yeah….. I’m still planning on reading it. :)

    I’ll bump it up in my reading priority. I seem to have been sucked into being re-addicted to trust metrics at the moment.

    Paul.. it’s simple. Use something like InnoDB and just make sure your entire DB fits in memory.
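    (A quick way to sanity-check that, sketched with MySQLdb; the connection details here are placeholders.)

    ```python
    # Rough check that the whole DB fits in the InnoDB buffer pool.
    # Connection parameters are placeholders.
    import MySQLdb

    conn = MySQLdb.connect(host="localhost", user="root", passwd="", db="mysql")
    cur = conn.cursor()

    # Total data + index size across all tables, in bytes.
    cur.execute("SELECT SUM(data_length + index_length) FROM information_schema.tables")
    db_bytes = cur.fetchone()[0] or 0

    cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
    pool_bytes = int(cur.fetchone()[1])

    print("db: %d MB, buffer pool: %d MB, fits: %s" %
          (db_bytes / 2**20, pool_bytes / 2**20, db_bytes <= pool_bytes))
    ```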

    Some of the alternative BigTable implementations are getting close to being ready….

    Kevin

  4. I know we’ve talked about this before, but you’re forgetting one very important thing:

    Most web-based DB installations aren’t throughput bound; they’re IO bound. Spinn3r is unusual this way in that you care about MB/s a lot.

    We don’t, and the majority of SQL DBs at web companies also don’t. Instead we just want to know how many IOPS we can do, usually in the 4-16KB range.

    In that scenario, the SSD vs RAM vs HDD debate changes dramatically, and GigE is no longer nearly as big of a limiting factor. Even if it were, 10GigE is no longer super-expensive, either, and people are deploying Infiniband a lot more, too. (Personally, I think I’d love to see Infiniband-like technologies trump ethernet, but I doubt it’ll happen).

    We should compare SSD notes when I can say more about some of the products we’ve seen. Dang NDAs… :)

  5. What about fiber channel disks? Like the IBM solution here: http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-43536. And the disks here: http://www.ptusainc.com/Ibm-Hard-Drive-73-4-Gb-Hot-Swap-3-5-2gb-Fibre-Channel-15k-32p0768.aspx.

    It seems they’re rated 2Gbps (2 gigabits per second) which is quite nice, but I’m not so sure about pricing. Our sysadmin seems to be quite fond of them though (only theoretically so far).

  6. Sorry, please take out the dots from the end of both links above. Girrr, wordpress.

  7. Hey Don,

    Can you say which products you’ve evaluated? Maybe some Sun hardware? :)

    Also, I totally agree. I only care about IOPS…… However, the SSDs I’m looking at can do about 18k read IOPS. They’re NO LONGER IO bound :).

    So now you start to think about MB/s again (thank god).

    So…. you’re back to saturating gigabit.
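    (The arithmetic, roughly, using the IOPS numbers from this thread and assuming ~118MB/s of usable GbE bandwidth:)

    ```python
    # Random-read throughput implied by IOPS at a given block size,
    # compared against a gigabit link. Numbers are rough figures only.
    GIGE_MBS = 118  # approx. usable MB/s on GbE

    def iops_to_mbs(iops, block_kb):
        return iops * block_kb / 1024.0

    for name, iops in [("HDD", 200), ("SSD", 18000)]:
        for block_kb in (4, 16):
            mbs = iops_to_mbs(iops, block_kb)
            print("%s @ %dKB: %6.1f MB/s  saturates GbE: %s" %
                  (name, block_kb, mbs, mbs > GIGE_MBS))
    ```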

  8. Artem, FC controllers are still a bit expensive. Further, if they’re HDDs, they’re going to be slow.

    Don is right. IOPS matter!

    Normal HDDs can only do about 200 IOs per second.

    Which is sad.

  9. Actually, it looks like we saturate a lot of other things, like CPU, before we get close to GigE. It’s always like this, though – solve one bottleneck and it just moves somewhere else in the system. :)

    Luckily we’ll just go 10GigE if we ever get the CPU issue solved…

  10. Don,

    Yup, because you’re using InnoDB :)

    MyISAM trounces InnoDB. I can use the full disk without any significant CPU usage :)

    We’re using InnoDB too though so I feel your pain.

    Kevin

  11. Yeah. Thank goodness for Sun’s acquisition. If there’s a company that’s interested in fixing InnoDB’s concurrency problems, it’s Sun. :)

    BTW, off-topic, but I’m not sure you’ve friended me on Twitter. Either that or Twitter is stupid. :)

    I keep getting spammed with “My updates are protected” but I’m already following you: http://twitter.com/DonMacAskill

  12. When thinking about saturating the 1GigE link, don’t forget that the data takes something like 5x more space to store on disk (and to read, too, I think) than it takes to fit in RAM or to transmit over the wire. So really you are only at like 200Mbit of useful data over the wire, not 800Mbit, with InnoDB. I am going to write my own durable in-memory DB for this reason.
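    (In other words, something like this; the 5x factor is just the estimate above, not a measured number.)

    ```python
    # Useful payload rate over the wire when the on-disk representation is
    # larger than the logical data. The 5x expansion factor is an estimate.
    DISK_READ_MBITS = 800   # ~100 MB/s of reads off the drive
    EXPANSION = 5.0         # on-disk size vs. logical/wire size, estimated

    print("useful data over the wire: %d Mbit/s" % (DISK_READ_MBITS / EXPANSION))  # 160
    ```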

  13. The big difference between RAM and Flash SSDs is random write speed. It’s still fairly awful for flash, though Intel is working on a new breed that’s better with writes.

    The server market is desperate for a cheap DDR2 based SSD. The Hyperdrive4 is bottlenecked by its SATA1 interface though it does have good bang for the buck. I don’t see RAM drives in use for consumers because most of the performance needs depend on sustained reads (as opposed to online transaction processing).

    “SSDs have a finite performance of about 100MB/s for reads.”
    That’s only current consumer level SSDs. The Mtron pros are capping out at 120MB/s sustained read. Intel is rumored to have a 200MB/s drive out in Q2.

    Also, you can put them in a RAID array and get over 800MB/s:
    http://www.nextlevelhardware.com/storage/battleship/

    I have a feeling that consumers will start snapping up server grade PCI-E RAID cards and plugging these Intel drives in four at a time to get 800MB/s. You can’t get over 800MB/s from a RAID card without spending big bucks.

  14. Hey Jonathan. Good point about the 3x overhead.

    There’s also the issue of doing a large update of the data from the net which would use a lot more disk IO.

    Kevin

  15. Hey Kirk,

    Great feedback.

    “The big difference between RAM and Flash SSDs are the random write speeds. They’re still fairly awful for flash though Intel is working on a new breed that are better with writes.”

    Agreed…… though I’m expecting this to be fixed with the next generation of SSDs that will do log structured filesystems. Fusion IO, STEC, and EasyCo all seem to use them.

    “That’s only current consumer level SSDs. The Mtron pros are capping out at 120MB/s sustained read. Intel is rumored to have a 200MB/s drive out in Q2.”

    Agreed, I was computing mean performance between all models.

    For example, a search engine could use cheaper SSDs if the workload were 99.9% reads and they were able to get the drives cheap and in bulk.

    “I have a feeling that consumers will start snapping up server grade PCI-E RAID cards and plugging these Intel drives in four at a time to get 800MB/s. You can’t get over 800MB/s from a RAID card without spending big bucks.”

    We’re going to do 3 over software RAID. I’m also not sure that SSDs will perform very well on the current generation of hardware RAID controllers as they’re designed for HDDs.

    Kevin

  16. Hey Don,

    That’s strange about Twitter… I approved you.. I’m following you as well.

    Kevin

  17. “I’m also not sure that SSDs will perform very well on the current generation of hardware RAID controllers as they’re designed for HDDs.”

    According to that battleship MTron link they scale perfectly. In other words one MTron does 120MB/s and two do roughly 240. That’s using the Pro model though which is designed for servers/RAID arrays so you have a point.

    I didn’t even consider software RAID but that’s a great idea for people who don’t want to install an expensive raid card. If you do a guide on how to install XP on a software RAID array I’ll link to it in a heartbeat.

  18. Hey Kirk,

    My benchmarks are a bit more detailed than the Battleship Mtron benchmarks. You can read them here:

    http://www.engadget.com/tag/SSD

    Specifically, random reads performed very poorly on the LSI RAID controller I tested.

    The best performance was on software RAID 0 with the noop Linux scheduler.

    Random read IO was the biggest performance hit …
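    (For anyone reproducing this: the scheduler can be flipped per device through sysfs, along the lines of the sketch below; the device names are placeholders for whatever your SSDs enumerate as.)

    ```python
    # Switch the Linux I/O scheduler to noop for each SSD (run as root).
    # Device names are placeholders.
    DEVICES = ["sda", "sdb", "sdc"]

    for dev in DEVICES:
        path = "/sys/block/%s/queue/scheduler" % dev
        with open(path, "w") as f:
            f.write("noop")
        print("%s -> %s" % (dev, open(path).read().strip()))
    ```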

  19. Kevin: the best results I’ve had in ultra-high-concurrency environments were from exporting db outputs (which were to be used in AJAX pages) to text CSV files and letting lighttpd squirt them out, with JavaScript at the client end to unpack them. Cut the db and scripting engine both out of the loop. I explained this technique to the local MySQL rep and he was speechless. ;)
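    (Roughly this pattern, sketched with Python’s csv module; the table, query, and paths are made up, and any DB-API driver would do in place of sqlite3.)

    ```python
    # Dump a query result to a static CSV file that lighttpd (or any
    # webserver) can serve directly; JavaScript on the client unpacks it.
    # Table, query, and paths are made up for illustration.
    import csv
    import sqlite3  # stand-in for whatever DB-API driver you actually use

    conn = sqlite3.connect("app.db")
    rows = conn.execute("SELECT id, title, score FROM items ORDER BY score DESC LIMIT 100")

    with open("/var/www/htdocs/items.csv", "w") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "title", "score"])
        writer.writerows(rows)
    ```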





