The cloud demonstrates its elasticity in the effortless provisioning of IT resources, the spin-up of new instances, and the matching of computing power to specific loads, whether for short-term or long-term projected traffic. Essentially, everything in the cloud can grow or shrink in accordance with the demand for a service or application. Take Netflix, for example. With the cloud’s pay-as-you-go model and its quick expansion or contraction of capacity, the leading online movie and TV show streaming service is able to use commodity hardware, which translates into smaller, cheaper machines. These, in turn, result in a lower cost of IT operations. While that is all well and good for many aspects of cloud computing, does it apply to the database as well?
The Challenge: Horizontal Scaling
While the cloud seems too good to be true, it almost is. The catch is successfully creating and utilizing an elastic database. Merely hosting an existing database with a cloud provider does not, in fact, make that database elastic. On the contrary, relational databases possess nowhere near the level of elasticity of other cloud services such as compute or storage. The relational database is the perfect example of a monolithic, single source of data that is viewed the same way now as it was ten years ago, with the ability to scale vertically only.
One of the main advantages of a relational database is its consistency. Multiple mechanisms, such as row-level locking and semaphores, are put in place to support numerous concurrent sessions, allowing users to access the same database, and even different rows within the same table, simultaneously. However, while this advantage is unheard of in plain file systems, it still creates issues when every part of a cloud-based application can scale out except the database.
In order to achieve database elasticity, a database needs to scale out. To scale out, a database needs to be partitioned, with its data distributed across multiple shards. Once data is distributed, the application needs to know exactly which shard holds each piece of data, across however many servers are necessary. In addition, the application needs to update every affected database immediately while preserving ACID (Atomicity, Consistency, Isolation, Durability) guarantees. (Ensuring ACID compliance is a big deal. And that’s why some other vendors have decided to give up on ACID, but that’s a topic for another blog!) Each database then becomes the source of truth for a portion of the data, which makes the application the source of truth for the entire database. Given that the cloud does not provide database elasticity, a gap is left in the effortlessness of cloud operations.
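To make the routing burden concrete, here is a minimal sketch of the lookup an application would need once data is sharded. It assumes simple hash partitioning on a customer ID; the shard names and the functions shard_for and group_by_shard are hypothetical and only illustrate the idea.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to connections.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(customer_id, shards=SHARDS):
    """Return the shard that owns this key, using hash partitioning."""
    # A stable hash (not Python's per-process hash()) keeps routing
    # consistent across application servers and restarts.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def group_by_shard(customer_ids, shards=SHARDS):
    """Group keys by owning shard so each shard is queried only once."""
    buckets = {}
    for cid in customer_ids:
        buckets.setdefault(shard_for(cid, shards), []).append(cid)
    return buckets
```

Note what this sketch leaves out: a transaction that touches keys on two shards gets no ACID guarantee from it, and with plain modulo hashing, adding or removing a shard changes the owner of most keys. Those two gaps are exactly why hand-rolled sharding falls short of elasticity.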
Two Examples Coming from Amazon Cloud
1. Is AWS RDS elastic?
‘Isn’t RDS the solution to this problem?’ The answer is simple: Amazon’s Relational Database Service (RDS) can grow and shrink vertically to bigger and smaller machines, but the data still lives in one single database instance, replicated between machines or hosted on a large instance. This is contrary to the cloud’s fundamental concept of growing with demand. In basic terms, with relational databases your only real scaling option is to scale up. Granted, this can be performed by your cloud vendor with minimal downtime; however, increasing the size of a single machine that is meant to hold an entire database clearly illustrates how the traditional database scalability limitation continues to be an issue even in the cloud.
Another option offered by Amazon is the creation of additional instances that hold read-only replicas of the database. The databases are still monolithic, containing the exact same data; however, because replicas are secondary copies, reads from them lag slightly behind the master database, which handles both reads and writes. While this is considered a form of scaling out, it is still not truly elastic, since it requires changing the application to split its reads from its writes and to handle lag in the replicas. Moreover, read-only replicas cannot be considered a solution for database elasticity, because the whole data set is duplicated rather than partitioned and distributed. Likewise, the write workload is not distributed: writes continue piling up in the master database, eventually saturating it, leaving no elastic solution in sight.
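The application changes described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how RDS or any particular driver works: the ReadWriteRouter class, its method names, and the fixed max_lag_seconds window are all hypothetical. Reads go to replicas in round-robin order, except for sessions that wrote recently, which are sent to the master so they do not see stale data.

```python
import time


class ReadWriteRouter:
    """Send writes to the master and reads to replicas, with a lag guard."""

    def __init__(self, master, replicas, max_lag_seconds=1.0,
                 clock=time.monotonic):
        self.master = master
        self.replicas = list(replicas)
        # Assumed upper bound on replica lag; real lag varies with load.
        self.max_lag_seconds = max_lag_seconds
        self.clock = clock
        self._last_write = {}  # session id -> time of that session's last write
        self._next = 0         # round-robin cursor over the replicas

    def for_write(self, session_id):
        """Every write goes to the single master, which is the bottleneck."""
        self._last_write[session_id] = self.clock()
        return self.master

    def for_read(self, session_id):
        """Pick a replica unless this session wrote within the lag window."""
        last = self._last_write.get(session_id)
        wrote_recently = (
            last is not None and self.clock() - last < self.max_lag_seconds
        )
        if not self.replicas or wrote_recently:
            return self.master
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica
```

However many replicas are added, for_write always returns the same master, which is the saturation problem in miniature.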
2. DynamoDB, a great example of an elastic database
So, how should a database service in the cloud actually work? Again, Amazon has broken new ground with its non-relational (NoSQL) database, DynamoDB. This truly elastic database service has the ability to grow and shrink without involving the user in logistical IT decisions. The only effort required of the user is to state the amount of data to be stored along with the expected service workload. Once that is established, the required throughput and storage should be calculated with the pricing in mind: each user pays per gigabyte used (pure on-demand) and for the number of transactions per second. Aside from that, each user is given a single point of access to connect to their database, in keeping with the essentials of cloud computing. In terms of service provision, DynamoDB is definitely speaking the language of the cloud.
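As a rough illustration of that calculation, the sketch below adds a storage charge to a throughput charge. The function name and every rate passed to it are placeholders made up for the example, not actual DynamoDB prices; the current rates and units are on the provider's pricing page.

```python
def estimate_monthly_cost(storage_gb, reads_per_sec, writes_per_sec,
                          price_per_gb, price_per_read, price_per_write):
    """Estimate a monthly bill from stored data and provisioned throughput.

    All prices are hypothetical inputs: price_per_gb is per gigabyte stored
    per month, and the other two are per unit of provisioned reads or writes
    per second, per month.
    """
    storage_cost = storage_gb * price_per_gb
    throughput_cost = (reads_per_sec * price_per_read
                       + writes_per_sec * price_per_write)
    return storage_cost + throughput_cost


# Example with made-up rates: 100 GB stored, 500 reads/s, 100 writes/s.
estimate = estimate_monthly_cost(100, 500, 100,
                                 price_per_gb=0.25,
                                 price_per_read=0.01,
                                 price_per_write=0.05)
```

The point is that the user reasons only about data volume and workload; machine sizes and instance counts never enter the calculation.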
ScaleBase: Distributed, Elastic, and Scalable
But what about elasticity for relational databases? Only ScaleBase has the technology to take a relational database and make it elastically scalable in the cloud, offering practical solutions to achieve the same level of service provision as NoSQL service providers. Essentially, ScaleBase does for RDS what DynamoDB has done for NoSQL.