Unlimited MySQL Scalability via Automated, Policy-based Sharding
In an OLTP database we have the following: big data, big transaction concurrency, as well as amix of reads and writes – all at any given point in time.
Every write operation is typically at least 5 operations for the database, including table, index(s), rollback segment, transaction log, row-level locking and more. The database engine itself becomes the bottleneck! It puts so many resources into buffer management, locking, thread locks/semaphores, and recovery tasks, that there are no available resources left for handling query data! Now multiply this with 5TB of data and 10,000 concurrent sessions…
The only way to scale OLTP databases is by intelligent distribution of data, as well as the concurrent sessions among numerous database servers. It’s all in the distribution of the data, if data is distributed in a smart, logical way, concurrent sessions can be distributed across servers.
Imagine going from one giant database server with 1TB of data and 1000 concurrent transactions to 10 databases, each with easy 100GB and 100 concurrent transactions. It would be a major win if you were able to keep the databases isolated, shared nothing, processing-wise, not only cables-wise. The best scenario is transactions that start and finish on a single database.
ScaleBase slices the data and distributes across several database servers (using a technique similar to sharding). After examination and analysis of the schema structure, table population, and SQL commands, the data distribution policy is set.
This policy determines how the data is to be split across the database servers. Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than splitting by columns (vertical partitioning, as for normalization). Each partition forms part of a database, which may in turn be located on a separate database server or physical location.
Complex SQL commands containing join operations and subqueries are supported by ScaleBase, and are executed successfully on underlying distributed databases. When queries are parallelized across several (or all) backend databases, the result sets are aggregated into a one meaningful result set, just as if it was returned from a single database. ScaleBase supports multiple cases of complex result aggregation such as ORDER BY, GROUP BY, aggregate functions and the use of LIMIT clause.
Result aggregation is always done in “lean memory”, meaning no excessive memory is required by ScaleBase to manage and aggregate results even if overall result set is large.
ScaleBase brings elasticity to the relational database!
ScaleBase supports online data redistribution (using a technique similar to re-sharding). Any number of databases can be added to the database array at any point in time. ScaleBase then automatically redistributes the data, populates the new nodes and ensures even data distribution. This is an online activity, no down time for the apps, small portions of the data are locked (for update) for short periods of time. In the same manner, any number of database nodes can be removed from the array, online.
ScaleBase supports hints to allow the user force behaviors on ScaleBase. It helps customers to change the target in a specific statement and route it to specific database or databases.
One useful hint is the one that enables Breakout Queries. A Breakout Query is a query that is run on all databases but instead of aggregating the result to a single resultset, ScaleBase will bring a combined results set of all results arrived from the underlying servers. For example SELECT COUNT(*). Standard execution would sum all counts from all databases into a single result. Breakout execution will give results out of each database, as seen below:
mysql> SELECT COUNT(*) FROM items; +----------+ | COUNT(*) | +----------+ | 385 | +----------+
mysql> SELECT COUNT(*) FROM items; +---------+-----------+ | ShardID | COUNT(*) | +---------+-----------+ | 1 | 100 | | 2 | 70 | | 3 | 120 | | 4 | 95 | +---------+-----------+