I recently came across the results of a TPC-C benchmark on MySQL based RDS databases. You can see the paper here. I think the results may throw light on many questions concerning MySQL scalability in general and RDS scalability in particular.
TPC-C is a standard database benchmark, used to measure database performance. Database vendors invest big bucks in running this test and showing off which database is faster, and can scale better.
It is a write intensive test, so it doesn’t necessarily reflect the behavior of the database in your application. But it does give some very important insights into what you can expect from your database under heavy load.
The Benchmark Process
First of all, I have some comments on the benchmark method itself.
- Generally, the benchmarking was orderly and methodical – which increases the credibility of the results.
- The benchmark generator client was tpcc-mysql which is an open-source implementation provided by Percona. It’s a good implementation, although I think DBT-2, provided by Apache, is more widely used.
- Missing are the mix weights of the different TPC-C transactions in the benchmark. The TPC-C benchmark contains six types of transactions (“new order”, “delivery”, “stock query”, etc.) and the mix weights are important.
- The benchmark focused on throughput, not latency. I think that although TPC-C is mostly about throughput (transactions per minute) it’s recommended to address the latency (“avg response time” for example) as well.
- TPC-C “number of warehouses” determines the size of the database being tested. The number of warehouses ranged between one and 32 in this benchmark. In MBs we’re talking 100MB-3GB. That’s usually a small database, and I would be interested in seeing how the benchmark ranks with 1000 warehouses (around 100 GB) or even more.
- The entire range of RDS instance types was examined, from small to quadruple Extra Large (4XL). Some very interesting results appeared, which will be the focus of this blog post.
The benchmark results are surprising.
- With hardly any dependency on the database size, MySQL reaches its optimal throughput at around 64 concurrent users. Anything above that causes throughput degradation.
- Throughput is improving as machines get stronger. However, there is a sweet-spot, a point where adding hardware doesn’t help performance. The sweet spot is around the XL machine, which reaches a throughput of around 7000 tpm. 2XL and 4XL machines don’t improve this limit.
It would seem that the Scale-Up scalability approach is rather limited, and bounded. And unfortunately, it has some rather close bounds.
So, what’s the bottleneck?
- CPU is an unlikely candidate, as CPU power doubles on 2XL and quadruples on 4XL machines.
- I/O is also an unlikely candidate, since machine memory (RAM) doubles on 2XL and quadruples on 4XL machines. More RAM means more database cache buffers, thus reduced I/O.
- Concurrency? Number of concurrent users? Well, we saw that optimal throughput is achieved with around 64 concurrent sessions on the database. See figure 4b in the benchmark report. With one user, the throughput is 1,000 transactions per user, but with 256 users, it drops to one transaction per user!
It is definitely weird that when more parallel users are added, we get performance degradation – a lower throughput than the one we have with fewer parallel users.
Well, the bottleneck is the database server itself. A database is a complex, sophisticated machine, with tons of tasks to perform in each and every millisecond it runs. For each query, the database needs to parse, optimize, find an execution plan, execute it, and manage transaction logs, transaction isolation and row level locks.
Consider the following example. A simple update command needs an execution plan to get the qualifying rows to update and then, reading those rows, it must lock each and every row. If the command updates 10k rows – this can take a while. Then, each update is executed on the tables, on each of the relevant indexes of that table, and also written to the transaction log file. And that’s for just a simple update.
Or another example – a complex query that takes a long time to run. Even one second is a lot of time in a highly loaded database. During its run, rows from source table(s) of that query are updated (or added, or deleted). The database must maintain “Isolation level” (see our isolation level blog), meaning that the user’s result must be a snapshot of the query as it was when the query started. New values in a row that was updated after the query began, even if committed, should not be part of the result. Rather, the database should go and find the “old snapshot” of the row, meaning the way the row looked at the beginning of the query. InnoDB stores information about old versions of rows in its tablespace. This information is stored in a data structure called a rollback segment. And if we go back to the previous example – here’s another task the database has to do – update the rollback segment.
We must remember that every command issued to the database actually generates dozens of recursive commands inside the database engine. With increasing sessions or user concurrency, the load inside the database engine increases exponentially. The database is a great machine, but it has its limits.
So What Do We Do?
So, we understand the database is crowded, and has hardware sweet-spots, which complicates the Scale Up solution – which is expensive, and doesn’t give the required performance edge. Even specialized databases have their limits and while the sweet-spot changes every database has one.
There are a lot of possible solutions to this problem – adding a caching layer is a must, to decrease the number of database hits, and any other action that can reduce the number of hits on the database (like NoSQL solutions) is welcome.
But the actual database solution is scale-out. Instead of one database engine, we’ll have several, say ten. Instead of 128 concurrent users or even 256 concurrent users (that, according to the TPC-C benchmark, bring the worst results), we’ll have ten databases with 26 users on each, and each database can reach 64 users (up to 640 concurrent users). And of course, if needed, we can add more databases to handle the increased scale.
It will be interesting to see how Scale Out solutions for databases handle TPC-C. For disclosure, we at ScaleBase are currently running a TPC-C benchmark and will publish our results shortly.