In a recent webinar entitled From Relational To Riak, we talked about why and how companies move from relational databases to Riak. Relational databases are still part of the technology stack for many companies, while others are innovating and incorporating NoSQL solutions, either as a replacement for relational databases or alongside them. As a result, these companies have simplified their deployments, enhanced their availability, and reduced their costs.
We will continue exploring the management of Big Data applications, and the infrastructure that powers them, in an upcoming webinar entitled Simplicity Scales – Big Data Application Management & Operations on June 16, 2015 at 10am Pacific.
In the webinar, we start by discussing the long and successful history of RDBMS implementations. The first paper to propose the relational database model, entitled A Relational Model of Data for Large Shared Data Banks, was written by E. F. Codd and published in 1970. Since then, the RDBMS has been the standard by which all other datastores are measured.
In the webinar, we discuss several of the current industry trends that are driving NoSQL adoption and the key reasons that companies choose a NoSQL technology such as Riak.
Understanding the requirement for high availability begins with understanding the immense cost of downtime. We have written about this concept frequently in the past, including an infographic (Down with Downtime) and a blog post (The Requirement of High Availability).
Relational databases typically address the challenge of availability with a master/replica architecture, in which a cluster consists of a single master and multiple replicas. Under this configuration, the master is responsible for accepting all write operations and coordinating with the replicas to apply the updates in a consistent manner. Read requests can either be proxied through the master or sent directly to the replicas.
In contrast, Riak is a masterless system designed for high availability, even in the event of hardware failures or network partitions. Any server (termed a “node” in Riak) can serve any incoming request and all data is replicated across multiple nodes. If a node experiences an outage, other nodes will continue to service read and write requests.
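The masterless idea can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: `Node`, `ToyCluster`, the replica count of 3, and the way replicas are chosen are all hypothetical and are not Riak's actual implementation or API. It shows only that reads and writes keep succeeding when one replica is down.

```python
class Node:
    """A toy storage node: an in-memory dict plus an up/down flag."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class ToyCluster:
    """Hypothetical masterless cluster: no node is special for writes."""

    def __init__(self, names, replicas=3):
        self.nodes = [Node(n) for n in names]
        self.replicas = replicas  # assumed replica count, for illustration only

    def replicas_for(self, key):
        # Pick `replicas` consecutive nodes, starting at a position derived
        # from the key (a simplistic placement rule, not Riak's).
        start = sum(key.encode("utf-8")) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, key, value):
        # Write to every reachable replica; return how many accepted the write.
        written = 0
        for node in self.replicas_for(key):
            if node.up:
                node.data[key] = value
                written += 1
        return written

    def get(self, key):
        # Any reachable replica that holds the key can answer the read.
        for node in self.replicas_for(key):
            if node.up and key in node.data:
                return node.data[key]
        return None


cluster = ToyCluster(["node1", "node2", "node3", "node4", "node5"])
cluster.put("user:42", "Ada")

# Simulate an outage on one of the replicas holding the key.
cluster.replicas_for("user:42")[0].up = False

print(cluster.get("user:42"))             # still readable from another replica
print(cluster.put("user:42", "Grace"))    # still writable; fewer replicas accept
```

A real system must also bring the replica that missed the write back up to date once it recovers, which this sketch does not attempt.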
These concepts, and a case study of a customer who had to address these challenges, are discussed in greater detail in the webinar.
As with high availability, the approaches to scalability adopted by relational databases and Riak differ significantly. Relational databases typically scale up by increasing server and storage capacity. Special, expensive versions of the database may technically distribute load across multiple machines, but they rely on shared storage that often becomes a bottleneck. These systems are not designed to run on commodity hardware.
Another way relational databases (and some NoSQL systems) scale is through sharding, which distributes data across several database servers. A common example would be putting user data for different geographical regions (e.g., US and EU) on different machines, or using an alphabetical or numerical order to split data. While this seems simple, sharding is complex and inherently inflexible.
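To make the inflexibility concrete, here is a minimal sketch of alphabetical sharding. The `shard_for` helper, the boundary letters, and the user names are all hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right
from collections import Counter


def shard_for(key, boundaries):
    """Route a key to a shard by its first letter.

    `boundaries` is a sorted list of upper-case letters; each one starts a
    new shard. With ["N"], shard 0 holds A-M and shard 1 holds N-Z.
    """
    return bisect_right(boundaries, key[0].upper())


users = ["alice", "bob", "carol", "mallory", "nina", "oscar", "sam", "trent"]

# Two shards split at "N": the routing rule lives in application code.
old = {u: shard_for(u, ["N"]) for u in users}
print(Counter(old.values()))

# Adding a third shard means choosing new boundaries by hand and then
# physically moving every key whose shard assignment changed.
new = {u: shard_for(u, ["I", "R"]) for u in users}
moved = [u for u in users if old[u] != new[u]]
print(moved)
```

The boundaries have to be picked by someone who knows the data. If the real distribution of names is skewed, one shard ends up hotter than the others until the split is redone.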
In Riak, data is automatically distributed across nodes using consistent hashing. Consistent hashing ensures that data is spread evenly around the cluster and that new nodes can be added with automatic, minimal reshuffling of data. This significantly reduces the risk of hot spots in the database and lowers the operational burden of scaling.
In this section of the webinar, in addition to discussing the concepts at length, we provided the case study of a customer who experienced “viral growth” and was able to achieve scalability without downtime.
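The "minimal reshuffling" property can be seen in a small sketch of a generic consistent-hash ring. This is a textbook variant written for illustration, not Riak's actual ring code; the `HashRing` class, the choice of 64 virtual points per node, and the key names are assumptions.

```python
import hashlib
from bisect import bisect_right


def _hash(value):
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per node (assumed value)
        self._points = []      # sorted (position, node) pairs
        self._positions = []   # positions only, kept for bisect lookups
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node claims several points on the ring to smooth out the load.
        self._points.extend(
            (_hash(f"{node}#{i}"), node) for i in range(self.vnodes)
        )
        self._points.sort()
        self._positions = [position for position, _ in self._points]

    def node_for(self, key):
        # A key belongs to the first point clockwise from its own position.
        idx = bisect_right(self._positions, _hash(key)) % len(self._points)
        return self._points[idx][1]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["node1", "node2", "node3"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node4")
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to node4")
```

Only keys that now fall to the new node change owner; every other key stays where it was. A naive `hash(key) % number_of_nodes` scheme, by contrast, would reassign most keys whenever the node count changes.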
Flexible Data Model
In what was the most developer-centric portion of this session, we compared the relational data model with the Key/Value model provided by Riak and worked through an example of data modeling: the migration of a simple posts database (one that stores blog content such as this) from PostgreSQL to Riak.
In a relational database, data is organized into tables that are separate and unique structures. Within these tables exist rows of data organized into columns. As such, you interact with the database by retrieving or updating entire tables, individual rows, or a group of columns within a set of rows.
In contrast, Riak has a much simpler data model. An Object is both the largest and smallest element of data. As such, you interact with the database by retrieving or modifying the entire object. There is no partial fetch or update of the data.
A Key in Riak is simply a binary value (or a string) that is used to identify an Object. The Key/Value pair (or Object) is stored in a higher-level namespace called a Bucket. And, with Riak 2.0, there is an extra layer of abstraction known as Bucket Types.
This Key/Value/Bucket model enables broad flexibility in modeling the application's data domain with Riak as the data store for persistence.
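As a rough sketch of what this looks like for the posts example, the snippet below uses a toy in-memory store rather than a real Riak client. `ToyKVStore`, the bucket type `"default"`, the bucket `"posts"`, the key, and the post fields are all assumptions made for illustration and are not taken from the webinar.

```python
import json


class ToyKVStore:
    """In-memory stand-in for a key/value store; NOT the Riak client API."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket_type, bucket, key, value):
        # The value is opaque to the store: it is serialized and stored whole.
        self._objects[(bucket_type, bucket, key)] = json.dumps(value)

    def get(self, bucket_type, bucket, key):
        raw = self._objects.get((bucket_type, bucket, key))
        return None if raw is None else json.loads(raw)


store = ToyKVStore()

# One row of a hypothetical relational `posts` table becomes one object,
# addressed by bucket type, bucket, and key.
post = {
    "title": "From Relational to Riak",
    "author": "jane",
    "body": "Post text goes here",
    "tags": ["riak", "nosql"],
}
store.put("default", "posts", "from-relational-to-riak", post)

# There is no partial update: changing one field means fetching the whole
# object, modifying it in the application, and storing the whole object back.
fetched = store.get("default", "posts", "from-relational-to-riak")
fetched["title"] = "From Relational to Riak (Updated)"
store.put("default", "posts", "from-relational-to-riak", fetched)

print(store.get("default", "posts", "from-relational-to-riak")["title"])
```

Note that related data such as tags can be embedded in the object itself instead of living in a separate join table, which is one of the modeling choices the key/value approach leaves to the application.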
We have written at length about data modeling with Riak, and several of those resources provide valuable additional context for the material covered in the webinar.
Relational Databases have:
- Foreign keys and constraints
- Sophisticated query planners
- Declarative query language (SQL)
Riak has:
- A Key/Value model where the value is any unstructured data
- More data redundancy that provides better availability
- Eventual consistency
- Simplified query capabilities
- Riak Search
What you will gain:
- More flexible, fluid designs
- More natural data representations
- Scaling without pain
- Reduced operational complexity
For more information on moving from relational databases to Riak, download our whitepaper. To chat with a member of the Riak team on the topic, please request a Tech Talk. You can also learn more at the upcoming webinar Simplicity Scales – Big Data Application Management & Operations.