Frequently Asked Questions
Is ScaleBase’s distributed database ACID compliant?
Yes. ScaleBase software preserves full ACID properties of MySQL. It supports two-phase commit and rollback across a distributed relational database, thereby maintaining ACID compliance and full relational integrity, even for transactions that require multiple MySQL instances.
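As a rough illustration of the two-phase commit idea mentioned above, the sketch below coordinates a commit across several participants. It is not ScaleBase's implementation; the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names used only to show the protocol's shape.

```python
class Participant:
    """Hypothetical stand-in for one MySQL instance taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: simulates a local success or failure
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Commit on all participants, or roll back on all of them."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: everyone voted yes, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2: at least one participant voted no, so roll back everywhere.
    for p in participants:
        p.rollback()
    return False
```

A real coordinator also has to log its decision durably and handle participants that time out; this sketch leaves both out.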
Does ScaleBase’s distributed database support SQL’s query model?
Yes. ScaleBase’s distributed database management system supports all SQL queries, including cross-shard operations, ORDER BY, GROUP BY, LIMIT, and more. ScaleBase also provides automated aggregation of query results for complex queries that require data from multiple shards (we call shards “clusters”).
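To give a feel for what aggregating an ORDER BY ... LIMIT query across shards involves, here is a minimal sketch. It assumes each shard has already returned its rows sorted by the same column; the `merge_ordered` function and the sample rows are hypothetical and do not reflect how ScaleBase does this internally.

```python
import heapq
from itertools import islice


def merge_ordered(shard_results, key, limit):
    """Merge per-shard row lists, each already sorted by `key`, and keep the first `limit` rows."""
    # heapq.merge lazily interleaves the sorted inputs without re-sorting everything.
    merged = heapq.merge(*shard_results, key=key)
    return list(islice(merged, limit))


# Hypothetical rows returned by two shards, each sorted by "total".
shard_a = [{"id": 1, "total": 5}, {"id": 4, "total": 20}]
shard_b = [{"id": 2, "total": 7}, {"id": 3, "total": 9}]

top_three = merge_ordered([shard_a, shard_b], key=lambda row: row["total"], limit=3)
```

Because every shard sorts its own rows, the merging layer only has to interleave them and can stop as soon as the limit is reached.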
Is ScaleBase’s distributed database transparent to my applications?
Yes. ScaleBase is completely transparent to your applications:
- Access transparency – clients are unaware of the distribution of data.
- Location transparency – clients see a single database namespace and data can be relocated without any changes to the application.
- Concurrency control transparency – users and applications can access shared data without interference between each other.
- Replication transparency – clients are unaware if replica copies of the data exist.
- Failure transparency – ScaleBase is fault-tolerant, allowing users and applications to complete their tasks despite the failure of hardware or software components.
- Migration transparency – users are unaware of the movement of data or processes within the system.
- Scaling transparency – scale out / scale in occurs without any changes to applications.
How does ScaleBase scale your database?
ScaleBase implements transparent horizontal partitioning of data into multiple independent database instances, a concept also known as sharding. Sharding is a well-known method for scaling databases, but it’s very difficult to implement as a DIY project. ScaleBase technology automates the process – so your database scales, and you don’t have to manage any of the difficulty and complexity associated with it directly in your application code.
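For readers new to sharding, the sketch below shows one common way to map a row to a shard: hashing its key. This is only an assumption for illustration; the FAQ does not say which distribution policy ScaleBase uses, and `shard_for` is a hypothetical helper.

```python
import zlib


def shard_for(key, shard_count):
    """Map a sharding key to a shard index in the range [0, shard_count)."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % shard_count


# The same key always lands on the same shard.
customer_shard = shard_for("customer-1042", shard_count=4)
```

Writing this routing by hand means every query in the application has to know the shard count and the key, which is the DIY difficulty described above.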
Are all database servers the same? Duplicate copies?
No. The additional databases are not plain copies. Each holds a portion of the overall data, and each is independent of the others. That makes it possible for ScaleBase to access them in parallel without interference, efficiently increasing the total throughput with every database instance added to the system. Making all databases the same would not be a good scaling mechanism: every write would have to be applied to every copy, so your application would quickly run into bottlenecks, especially when scaling writes.
What is the best way to split and consolidate the data among the database servers?
How to split the data is the most difficult decision to make. ScaleBase Analysis Genie lets you quickly analyze your schema and application transactions, and then it automatically builds a sharding configuration optimized for your application. ScaleBase Analysis Genie will also simulate your transaction load with the new policy, so you will understand what scalability benefits to expect even before you install ScaleBase in your environment.
Does splitting and consolidating data among multiple database servers really improve database scaling?
Yes, absolutely. Sharding has proven itself to be the best way to scale out databases. Just have a quick look at our TPC-C performance report, or some testimonials from people who implemented our sharding solution. They should give you a good feel for the kind of performance and scaling you can expect. The biggest drawback of manual sharding is the technical difficulty of its implementation – a problem that ScaleBase and its automated and intelligent data distribution solve for our customers.
Do I have to make changes to my application?
No. ScaleBase is a transparent solution, so it supports the existing SQL commands and database schemas in your application. However, the Analysis Genie might suggest some changes as an optimization to improve application performance.
How do you rebalance the shards?
ScaleBase offers Continuous Re-distribution of Data (CoRD), which rebalances the shards when needed. This smart utility moves small chunks of data at any time, keeping the database operational and minimizing the impact on running applications. Rebalancing works both ways: when increasing and when decreasing the number of shards.
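The idea of moving data in small chunks can be sketched as follows. This is a simplified assumption, not a description of CoRD itself: shards are modeled as plain dictionaries, and `move_in_chunks` is a hypothetical helper.

```python
def move_in_chunks(source, target, keys, chunk_size):
    """Move `keys` from `source` to `target` a few rows at a time, yielding each chunk's size."""
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        # Copy the chunk to the target first...
        for key in chunk:
            target[key] = source[key]
        # ...then remove it from the source, so no row is ever missing from both.
        for key in chunk:
            del source[key]
        # Yielding between chunks gives other work a chance to run.
        yield len(chunk)


# Hypothetical shards modeled as dictionaries.
old_shard = {i: f"row-{i}" for i in range(5)}
new_shard = {}
chunk_sizes = list(move_in_chunks(old_shard, new_shard, [0, 1, 2, 3, 4], chunk_size=2))
```

Keeping each chunk small limits how long any single move holds resources, which is why the database can stay operational during rebalancing.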
How do you know that you need to rebalance the shards?
ScaleBase offers instrumentation in the GUI that lets operators see how statements are distributed among individual clusters and detect hotspots. Besides that, since ScaleBase is completely transparent for traditional MySQL tools, many existing monitoring, reporting and analytical tools can be used to analyze the load and utilization of individual clusters directly, too.
Do I need to follow special processes for DDL execution?
No. Just connect your client to ScaleBase directly, and ScaleBase will automatically run the DDL commands on all databases on your behalf.
Do I need to install anything on my database machine?
No. ScaleBase requires no agent installation on your database machine – your machines remain untouched.
Do I need to install anything on my client machines?
No. With ScaleBase, you get to keep your existing drivers, libraries, application servers or serialization frameworks. ScaleBase implements 100% of the MySQL network protocol and is completely indistinguishable from a MySQL instance to existing applications and clients.
Can I keep my current database monitoring software?
Yes. ScaleBase takes nothing away from its database instances. You can keep monitoring all your MySQL instances directly as usual. In addition, ScaleBase provides a GUI that shows the availability and load distribution of all databases in the system.
Is ScaleBase a new bottleneck and single point of failure in my architecture?
No. You will be advised to always deploy two or more ScaleBase Controllers in order to eliminate the risk of a bottleneck and a single point of failure. You can distribute the load and implement failover in your application database configuration (primary/secondary DB), with JDBC driver load balancing, or with a standard TCP-based load balancer such as F5 or AWS Elastic Load Balancer.
Does ScaleBase handle database high availability?
Yes. By utilizing replication, each primary (master) database server can have one or more backup (slave) databases. When ScaleBase detects that the primary database server is down, it automatically directs traffic to the most up-to-date backup database. This ensures that the application continues working, without any errors or failures.
Does ScaleBase monitor the replication status?
Yes. ScaleBase monitors the replication status, so it does not fail over to a backup server that is not up-to-date. Since replication is asynchronous, backup databases can lag behind the primary database server and hold data that is not current. ScaleBase lets you define an acceptable lag time and will not fail over if the lag exceeds it. When it does fail over, ScaleBase always chooses the backup database server that is most up-to-date.
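The selection rule described here can be sketched in a few lines. The function name `pick_failover_target`, the replica names, and the lag values are all hypothetical; how ScaleBase actually measures lag is not covered in this FAQ.

```python
def pick_failover_target(replica_lag, max_lag_seconds):
    """Return the least-lagged replica within the allowed lag, or None if none qualifies.

    `replica_lag` maps a replica name to its lag in seconds; None means the lag is unknown.
    """
    eligible = {
        name: lag
        for name, lag in replica_lag.items()
        if lag is not None and lag <= max_lag_seconds
    }
    if not eligible:
        return None  # no replica is fresh enough, so do not fail over
    return min(eligible, key=eligible.get)


# Hypothetical lag readings for three backup servers.
lags = {"backup-1": 12, "backup-2": 3, "backup-3": None}
target = pick_failover_target(lags, max_lag_seconds=5)
```

Returning `None` models the case where every backup exceeds the configured lag and no failover takes place.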
Does each primary database server have to have a backup database server? Does it have any limits for backup servers?
No. A backup server for each primary server is not mandatory, but it’s highly recommended. You can have as many backup servers as you like, with different roles for different uses. Some may be local, some in a remote location for disaster recovery. ScaleBase lets you implement a wide range of HA and DR scenarios for your databases.
Can backup database servers be used while in backup mode?
Yes. If so configured, ScaleBase can direct writes to the primary database server and reads to the backup databases. It’s a great way to scale your database read load.
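A minimal sketch of read/write splitting follows, assuming a simple rule: statements starting with SELECT go to a backup in round-robin order, and everything else goes to the primary. The `ReadWriteRouter` class and the server names are hypothetical, and ScaleBase's real routing rules are not described here.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to replicas in turn and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, all traffic falls back to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["backup-1", "backup-2"])
read_target = router.route("SELECT * FROM orders")
write_target = router.route("INSERT INTO orders VALUES (1)")
```

A production router would also need to keep locking reads such as SELECT ... FOR UPDATE, and reads inside an open transaction, on the primary; this sketch ignores those cases.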
What happens when the primary database server fails?
When the primary database server fails, ScaleBase detects the outage and runs a few additional tests to confirm the server is actually down, so that a random brownout does not trigger a costly failover. ScaleBase then automatically determines the most up-to-date backup database and fails over to it, and that backup becomes the new master. The original primary database server is marked as down and will not get any traffic until database administrators perform a complete failback.
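The "additional tests" step can be pictured as repeated health probes: the server is treated as down only if every probe fails. This is an illustrative assumption rather than ScaleBase's actual detection logic; `is_really_down` and the `probe` callable are hypothetical.

```python
def is_really_down(probe, attempts=3):
    """Return True only if `probe()` reports failure on every one of `attempts` tries."""
    for _ in range(attempts):
        if probe():
            # A single successful probe means this was a transient blip, not an outage.
            return False
    return True


# Hypothetical probe results: two failures followed by a success (a brownout).
brownout = iter([False, False, True])
triggers_failover = is_really_down(lambda: next(brownout), attempts=3)
```

In practice the probes would be spaced out in time, and the number of attempts trades detection speed against the risk of a needless failover.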
What happens if the primary database server comes back to life?
Nothing will happen automatically, but the administrators can nominate it to become the primary server again after its data has been re-synced. Since many changes might have happened while the primary was down, ScaleBase does not access the primary again without user intervention.