Understanding Table Policies
ScaleBase splits the data across several database servers using a technology called sharding. After examination and analysis of the schema structure, table population, and SQL commands, a sharding policy or a distribution policy is set. This policy determines how the data is to be split across the database servers, which tables are split tables, which are global tables, and which are master tables.
SPLIT TABLES
A split table is a table whose data is distributed across several database servers. The table structure is the same on all the database servers, but the data is divided between them.
These are usually large tables, or tables that are subject to many write operations. Since each database contains a portion of the data, write operations (including index updating) are much faster.
So, for example, configuring an Employees table as a split table might result in the following data distribution:
| Database A | Database B | Database C | Database D | ||||||||||||||||||||||
|
|
|
|
With ScaleBase:
- The splitting key can be any single column of the split table.
- The splitting key can be of any data type, including numeric, character, date, except for BLOB, CLOB.
- The splitting key cannot be updated after the row is inserted into a table. Such “row migration” will be supported in future releases.
- The splitting policy can be:
- Hash: a function that is evaluated using the split key determines the target database this row should populate. This option is useful to evenly distribute the rows across several databases, using an ID as the split key. Use of a numeric split key is recommended for this type of splitting policy.
- Range: in which the splitting policy determines which ranges of rows will populate which database. This option is useful for dates and codes.
- List: a use case of the Range policy. The splitting policy determines which rows will populate which database. This option is useful when you need to distribute specific data to specific databases for security reasons or SLA.
GLOBAL TABLES
A global table is a table that is replicated, structure and data, on all database servers.
These are usually smaller tables, less often updated, which are used in join operations with the split tables.
In our example, had we configured the Departments table as a global table, we would expect the following data distribution:
| Database A | Database B | Database C | Database D | ||||||||||||||||||||||||||||||||
|
|
|
|
With ScaleBase:
- Data transportation and replication for global tables is done either by ScaleBase or by an external replication mechanism such as MySQL replication (“replicated global” table type).
- DDL and maintenance commands are distributed to all database instances.
- Reads are done from a single database.
- Data manipulation (DML: Update, Insert, Delete) is done on all databases.
- Execution of DDL and DML run in a “master first” manner: after the command succeeds on the master database, it runs in parallel on all others, ensuring data consistency, row level locks, and transaction serialization on this one master database, and avoiding race conditions when running in parallel.
MASTER TABLES
Master tables are tables that are maintained only once in a single database.
These tables are not used in joins with split tables, and generally hold metadata or peripheral data with no significant data or session concurrency volume.
In our example, had we configured the States table as a global table, we would expect the following data distribution:
| Database A | Database B | Database C | Database D | ||||||||||
|
Print This Page


