So, your application is growing, you’re serving more data, more users, and more transactions — congratulations, you must be doing something right! With this increase in information, however, can your MySQL database scale to handle it all? If not, how can you scale out your MySQL database?
There are several MySQL scalability approaches you might investigate, including read/write splitting, clustering, and putting your database on bigger hardware. Each of these approaches can be helpful, but we believe the ultimate scalability benefits come from deploying a technique called database sharding: splitting your data across multiple independent database servers.
Some of you may have already sharded some of your MySQL databases. If so, you are probably quite familiar with the challenges you must overcome to reach the scalability goals of sharding. Sharding is how Facebook, Twitter and Tumblr (to name just a few) achieve massive MySQL scalability, but you may not have the same resources and technical talent to replicate their success on your own.
Because the task is so complex, you will find many sources that say “sharding sucks.” We at ScaleBase can only agree. In fact, that’s why we exist: to develop software that automates sharding, so you can reap its benefits without the traditional hardships.
A recent webinar, Scale Out MySQL Databases – Sharding Made Easy, discusses sharding challenges and how ScaleBase’s Data Traffic Manager software helps solve them. This post is the first part of a series dedicated to describing and overcoming MySQL sharding challenges.
DIY Sharding Requires Application Re-writes – YUCK!
The first challenge of “Do It Yourself” (DIY) database sharding is in many ways the biggest one: you have to re-write portions of your application with hard-coded logic that implements the sharding policies you want to employ for your business.
Developing database sharding code has a well-deserved reputation for being a real hassle. The idea of re-writing your application to do database sharding yourself is not only intimidating, it’s also completely unnecessary.
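To make “hard-coded sharding logic” concrete, here is a minimal Python sketch of the kind of routing code a DIY approach pushes into the application. Everything in it is an assumption for illustration: the shard names, the choice of a customer id as the sharding key, the hash-modulo policy, and the `orders` table are all hypothetical, and a real system would also open and manage connections to each server.

```python
import zlib

# Hypothetical shard list: in a real deployment each entry would be the
# connection details of a separate MySQL server.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(customer_id: int, shards=SHARDS) -> str:
    """Pick a shard by hashing the sharding key (here, a customer id)."""
    # crc32 gives the same value in every process, so all application
    # servers agree on where a given customer lives.
    digest = zlib.crc32(str(customer_id).encode("utf-8"))
    return shards[digest % len(shards)]


def build_order_query(customer_id: int):
    """Return (target shard, SQL, parameters) for one customer's orders.

    In a DIY setup, every data-access call site needs routing like this.
    """
    target = shard_for(customer_id)
    sql = "SELECT * FROM orders WHERE customer_id = %s"
    return target, sql, (customer_id,)


if __name__ == "__main__":
    print(build_order_query(42))
```

The routing function itself is short. The cost comes from threading it through every query in the application and keeping the policy consistent everywhere it is used.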
Distributing data across multiple databases can be a daunting task that will require intensive attention from your best engineers. You need to make sure you do things efficiently, or you lose the main benefit of the whole exercise: performance and scalability under extreme load.
Then there is the constant risk of losing data consistency and integrity. You’ll have to keep testing for ACID compliance to prevent unexpected results. In addition to the standard functional testing of your application’s business logic, you will need additional testing to ensure that even mundane daily tasks such as data consistency checks, backups, and recovery work as expected, too.
Essentially, any MySQL tools that you use today (e.g. ETL, reporting, phpMyAdmin, mysqldump) will likely be rendered useless if you shard manually, as these products were not designed to look for your DIY shards scattered across multiple servers. You will either have to work with data shard-by-shard, or you will have to implement a new suite of tools that understands your custom sharding.
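Working “shard-by-shard” usually means running the same statement on every shard and merging the partial results yourself. The sketch below illustrates that pattern for a simple total. It is only an illustration: in-memory SQLite databases stand in for MySQL shards so the example runs anywhere, and the `orders` table and its columns are hypothetical.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database standing in for one MySQL shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def total_revenue(shards):
    """Run the same aggregate on every shard and add up the partial sums."""
    total = 0.0
    for conn in shards:
        # COALESCE turns the NULL returned for an empty shard into 0.
        (partial,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders"
        ).fetchone()
        total += partial
    return total


# Three stand-in shards, one of them empty.
shards = [
    make_shard([(1, 10.0), (2, 5.5)]),
    make_shard([(3, 20.0)]),
    make_shard([]),
]
print(total_revenue(shards))  # prints 35.5
```

A sum merges trivially. Averages, sorted and limited result sets, and joins that span shards each need their own merge logic, which is where a home-grown tool suite starts to grow.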
The next major challenge is coping with business changes and growth. Is your code universal and robust enough to work without additional modifications and re-testing of your MySQL ecosystem as the number of shards grows? Will the original developers still be at hand when you need to adjust to growing needs? Will the code be documented well enough for others to pick up where they left off? We see companies coming to us and asking about alternatives because the ongoing maintenance and evolution of their home-grown sharding solutions is more painful than they ever anticipated.
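One reason growth hurts is that a simple sharding policy can tie data placement to the current number of shards. The Python sketch below assumes a hash-modulo policy (a common DIY choice, not necessarily yours) and counts how many keys would land on a different shard when a fifth shard is added to four. The key range and function names are hypothetical.

```python
import zlib


def shard_index(key: int, shard_count: int) -> int:
    """Hash-modulo placement: shard number for a key, given the shard count."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if shard_index(k, old_count) != shard_index(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    fraction = moved_fraction(range(10_000), old_count=4, new_count=5)
    print(f"{fraction:.0%} of keys change shards")
```

With an evenly distributed hash, a key stays put only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. Roughly 80% of the data would therefore have to move, and that migration has to be planned, coded, and tested by whoever maintains the sharding layer.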
For these (often well-known) reasons, DIY sharding is perceived as such a big challenge that many organizations eliminate sharding from their list of scaling strategies altogether. For those that took the step and are currently sharding on their own, constantly maintaining their sharding code typically turns out to be one of their biggest headaches.
Avoid Re-writing Applications – YEAH!
Is sharding a no-go then? Are there more acceptable alternatives? We believe there are, and we offer a very good one. ScaleBase provides software that automates and simplifies your MySQL database sharding. ScaleBase’s Data Traffic Manager software is middleware that sits between your application and an array of MySQL databases, and it does not require you to make significant changes to your application logic. Your applications remain basically “sharding-code-free” and your developers can stay focused on the business functionality of your applications.
The major conceptual difference between DIY sharding and ScaleBase is that, with ScaleBase, your applications see one large MySQL database even though it is transparently implemented with multiple database instances in the backend. Your entire existing MySQL ecosystem will continue to work the same as before. And, since we hired the best talent to implement the hard parts of sharding well, you don’t have to.
A platform-based solution like ScaleBase DTM (Data Traffic Manager) can save time and pay back your investment quickly because you don’t have to design, code, test, and deploy the sharding layer within your application and its supporting infrastructure. Managing change is a breeze, since DTM lets you quickly administer sharding policies and resources and accommodate growing operational needs.
Next time we’ll dig into another DIY sharding “challenge”: sharding Cloud-based MySQL systems, and how ScaleBase allows for infinite database scalability in the Cloud.
- Register to watch our Scaling out MySQL Databases – Sharding Made Easy Webinar
- Subscribe to the ScaleBase blog to be alerted when new entries are published.
- Contact ScaleBase to learn how ScaleBase can help you with MySQL scalability in a fault-tolerant architecture.
- Sign up for a free database scalability assessment to find out how your MySQL applications can benefit from scaling out.