• Contact Us
  • Support
  • Questions? Contact Us Today: 617.630.2800
  • Follow us on Twitter
  • Join our Facebook Group
  • Find us on LinkedIn
  • Subscribe to our RSS Feed
  • Search Site

  • Solutions
    • By Use Case
    • By Industry
    • By Delivery Model
  • Product
    • Product Architecture
    • Scale Out via Data Distribution
    • Scale Out via Read/Write Splitting
    • Data Traffic Management
    • Availability
    • Key Features / Benefits
    • Easing Capacity Planning
    • What is Database Sharding?
  • Resources
    • Datasheets
    • Case Studies
    • Whitepapers
    • Webinars
    • Videos
    • Scale Out Readiness Calculator
    • Database Assessment
    • Benchmark
    • Frequently Asked Questions
    • Documentation
  • Blogs
    • ScaleBase Blog
    • CTO Blog
  • Company
    • Customers
    • Management
    • News
      • Press Releases
      • Coverage
      • Events
    • Partners
    • Investors
    • Careers
    • Contact Us

You are here: ScaleBase / Blog / Architecture / Running TPC-C on MySQL/RDS

Running TPC-C on MySQL/RDS

15 Jun 2011 / 2 Comments / in Architecture, Blog, Performance, ScaleBase/by Paul Campaniello

I recently came across the results of a TPC-C benchmark on MySQL based RDS databases. You can see the paper here. I think the results may throw light on many questions concerning MySQL scalability in general and RDS scalability in particular.

TPC-C

TPC-C is a standard database benchmark, used to measure database performance. Database vendors invest big bucks in running this test and showing off which database is faster, and can scale better.

It is a write intensive test, so it doesn’t necessarily reflect the behavior of the database in your application. But it does give some very important insights into what you can expect from your database under heavy load.

The Benchmark Process

First of all, I have some comments on the benchmark method itself.

  1. Generally, the benchmarking was  orderly and methodical – which increases the credibility of the results.
  2. The benchmark generator client was tpcc-mysql which is an open-source implementation provided by Percona. It’s a good implementation, although I think DBT-2, provided by Apache, is more widely used.
  3. Missing are the mix weights of the different TPC-C transactions in the benchmark.  The TPC-C benchmark contains six types of transactions (“new order”, “delivery”, “stock query”, etc.) and the mix weights are important.
  4. The benchmark focused on throughput, not latency. I think that although TPC-C is mostly about throughput (transactions per minute) it’s recommended to address the latency (“avg response time” for example) as well.
  5. TPC-C “number of warehouses” determines the size of the database being tested. The number of warehouses ranged between one and 32 in this benchmark. In MBs we’re talking 100MB-3GB. That’s usually a small database, and I would be interested in seeing how the benchmark ranks with 1000 warehouses (around 100 GB) or even more.
  6. The entire range of RDS instance types was examined, from small to quadruple Extra Large (4XL). Some very interesting results appeared, which will be the focus of this blog post.

Results Analysis

The benchmark results are surprising.

  1. With hardly any dependency on the database size, MySQL reaches its optimal throughput at around 64 concurrent users. Anything above that causes throughput degradation.
  2. Throughput is improving as machines get stronger. However, there is a sweet-spot, a point where adding hardware doesn’t help performance. The sweet spot is around the XL machine, which reaches a throughput of around 7000 tpm. 2XL and 4XL machines don’t improve this limit.

It would seem that the Scale-Up scalability approach is rather limited, and bounded. And unfortunately, it has some rather close bounds.

So, what’s the bottleneck?

  • CPU is an unlikely candidate, as CPU power doubles on 2XL and quadruples on 4XL machines.
  • I/O is also an unlikely candidate, since machine memory (RAM) doubles on 2XL and quadruples on 4XL machines. More RAM means more database cache buffers, thus reduced I/O.
  • Concurrency? Number of concurrent users? Well, we saw that optimal throughput is achieved with around 64 concurrent sessions on the database. See figure 4b in the benchmark report. With one user, the throughput is 1,000 transactions per user, but with 256 users, it drops to one transaction per user!

Results Explanation

It is definitely weird that when more parallel users are added, we get performance degradation – a lower throughput than the one we have with fewer parallel users.

Well, the bottleneck is the database server itself. A database is a complex, sophisticated machine, with tons of tasks to perform in each and every millisecond it runs. For each query, the database needs to parse, optimize, find an execution plan, execute it, and manage transaction logs, transaction isolation and row level locks.

Consider the following example. A simple update command needs an execution plan to get the qualifying rows to update and then, reading those rows, it must lock each and every row. If the command updates 10k rows – this can take a while. Then, each update is executed on the tables, on each of the relevant indexes of that table, and also written to the transaction log file. And that’s for just a simple update.

Or another example – a complex query that takes a long time to run. Even one second is a lot of time in a highly loaded database. During its run, rows from source table(s) of that query are updated (or added, or deleted). The database must maintain “Isolation level” (see our isolation level blog), meaning that the user’s result must be a snapshot of the query as it was when the query started. New values in a row that was updated after the query began, even if committed, should not be part of the result. Rather, the database should go and find the “old snapshot” of the row, meaning the way the row looked at the beginning of the query. InnoDB stores information about old versions of rows in its tablespace. This information is stored in a data structure called a rollback segment. And if we go back to the previous example – here’s another task the database has to do – update the rollback segment.

We must remember that every command issued to the database actually generates dozens of recursive commands inside the database engine. With increasing sessions or user concurrency, the load inside the database engine increases exponentially. The database is a great machine, but it has its limits.

So What Do We Do?

So, we understand the database is crowded, and has hardware sweet-spots, which complicates the Scale Up solution – which is expensive, and doesn’t give the required performance edge. Even specialized databases have their limits and while the sweet-spot changes every database has one.

There are a lot of possible solutions to this problem – adding a caching layer is a must, to decrease the number of database hits, and any other action that can reduce the number of hits on the database (like NoSQL solutions) is welcome.

But the actual database solution is scale-out. Instead of one database engine, we’ll have several, say ten. Instead of 128 concurrent users or even 256 concurrent users (that, according to the TPC-C benchmark, bring the worst results), we’ll have ten databases with 26 users on each, and each database can reach 64 users (up to 640 concurrent users). And of course, if needed, we can add more databases to handle the increased scale.

It will be interesting to see how Scale Out solutions for databases handle TPC-C. For disclosure, we at ScaleBase are currently running a TPC-C benchmark and will publish our results shortly.

WHAT’S NEXT?

  • Download the ScaleBase Database Load Balancer for a free trial.
  • Read our white paper, and learn about the ScaleBase Database Load Balancer solution.
  • Check out our other posts for more technical info.

2 Responses to Running TPC-C on MySQL/RDS

  1. Liang says:
    June 16, 2011 at 3:11 am

    Hi liran,

    Looking forward to see your results. :-)

    Reply
  2. Une base de données dans le Cloud, bonne idée ? | www.LeGrandBI.com says:
    June 21, 2011 at 8:47 am

    [...] la casse… L’étude Scalebase peut être téléchargée ici. Article liésIngres veut contrer Qliktech par le cloud#FD10 Le Saas c’est la « Hot [...]

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Subscribe to Blog


 

Tweet
database assessment
Popular
  • Running TPC-C on MySQL/RDSJune 15, 2011, 8:45 pm
  • Backing Up MySQL With ScaleBaseSeptember 12, 2011, 8:34 am
  • Can Prepared Statements Improve Your Scalability?May 23, 2011, 5:57 pm
  • MemoryImageSeptember 6, 2011, 10:33 am
Recent
  • MySQL Sharding Blog Series: Does Sharding Make Sense on...May 16, 2013, 10:03 am
  • MySQL Sharding Blog Series: Does Sharding Work in the Cloud?May 13, 2013, 9:45 am
  • MySQL Sharding Blog Series: How to Avoid Re-writing ApplicationsMay 6, 2013, 11:54 am
  • Webinar: TechTalk: Optimize your MySQL sharding by Choosing...May 6, 2013, 9:09 am
Comments
  • [...] business model, and a strong executive. Scalebase,...February 4, 1:49 am by The Investing Edge of Enterprise IT
  • [...] Recently some customers running on Amazon EC2 asked...December 28, 3:27 pm by High Availability for your ScaleBase instance | MySQL | Syngu
  • [...] ScaleBase Releases Database TPC-C Performance Results...December 12, 2:29 pm by ScaleBase achieves 180K NO-TPM TPCC results on... | MySQL | Syngu
  • Hi Milo - I totally agree. You need to understand how you're...September 6, 10:33 am by liran
Tags
Amazon RDS application re-write Big Data Funding cloud sharding Database Availability database capacity database cloud database scalability database scalability strategy database sharding database virtualization data traffic manager Elastic Load Balancing Hadoop high availability Industry News MySQL MySQL 5.6 MySQL Clustering MySQL Scalability MySQL Scalability Issues MySQL scale out MySQL Sharding MySQL Tuning NewSQL next gen apps NOSQL scalability of databases Sharding single point of failure thread pool

Recent Comments

  • The Investing Edge of Enterprise IT on Management
  • High Availability for your ScaleBase instance | MySQL | Syngu on High Availability for your ScaleBase instance

Solutions

  • Solutions
    • Centralized Database Management & Visibility
    • Scalability
    • Database Availability
    • Delivery Models
    • Industry Solutions

Products

  • Product Overview
    • Product Architecture
    • Scale Out via Data Distribution
    • Scale Out via Read/Write Splitting
    • Data Traffic Management
    • Availability
    • Key Features / Benefits
    • Easing Capacity Planning
    • Database Sharding

Resources

  • Datasheets
  • Case Studies
  • Whitepapers
  • Webinars
  • Videos
  • Scale Out Readiness Calculator
  • Benchmark
  • Frequently Asked Questions
  • Documentation
  • Database Assessment

Company

  • Company
    • Customers
    • Management
    • News
    • Partners
    • Investors
    • Careers
© 2012 Scalebase Inc.

  • |Home
  • |Sitemap
  • |Privacy Policy
  • |Contact Us
  • |Blog