add a section on scalable relational databases
dimitri-yatsenko committed Aug 25, 2024
1 parent 1951ba8 commit 472a879
Showing 1 changed file with 36 additions and 18 deletions.
54 changes: 36 additions & 18 deletions book/02-concepts/02-database.md
@@ -44,32 +44,50 @@ A robust DBMS enforces such rules reliably, ensuring smooth operations, while in

Databases are dynamic, with data continuously updated by both users and systems. Even in the face of disruptions like power outages, errors, or cyberattacks, the DBMS ensures that the system recovers quickly and returns to a stable state. For users, the database should function seamlessly, allowing actions to be performed without interference from others working on the system simultaneously.

# Data models for databases
Databases have been built on a variety of data models.
As Guy Harrison describes in his 2015 book "Next Generation Databases" [@10.1007/978-1-4842-1329-2], the database industry has undergone three major revolutions:
1. Pre-relational (1950-1972)
2. Relational (1972-2005)
3. The Next Generation (2005-future)
# Data Models for Databases

The impact of the relational data model has been so great that the last two revolutions in databases have been defined by first embracing and then diverging from the relational model.
Databases have evolved through various data models over the decades. As Guy Harrison outlines in his 2015 book, *Next Generation Databases* [@10.1007/978-1-4842-1329-2], the database industry has experienced three major revolutions:

The NOSQL revolution, starting in the early 2000s, was propelled by a few factors:
* The need to scale databases beyond the capabilities of the existing relational database management systems at the time.
* The need to represent data structures that are difficult to express in relational constructs: e.g. vectors, JSON documents, data streams.
* The need for simpler data models where relational databases were needlessly complex: e.g. key-value stores.
1. **Pre-relational (1950-1972)**
2. **Relational (1972-2005)**
3. **The Next Generation (2005-future)**

This led to an explosion of new database architectures.
The relational data model has had a profound impact, shaping the last two revolutions in database technology.
Initially, the industry embraced the relational model, which offered a structured, standardized way to organize and query data.
However, as data needs evolved, the limitations of the relational model prompted the rise of alternative approaches, leading to the NoSQL revolution in the early 2000s.

In the meantime, relational databases did not stand still: they adopted new capabilities for scalability and versatility.
## The NoSQL Revolution

A modern relational database management system can accommodate diverse data models and handle a variety of data-handling jobs, so it can reasonably replace a range of separate software systems.
The NoSQL movement emerged in response to several key challenges:

A number of articles describe how one can simplify system design by replacing many components with a relational database: [Just use Postgres for everything](https://www.amazingcto.com/postgres-for-everything/):
- **Scalability:** The need to scale databases beyond the capabilities of existing relational database management systems (RDBMS) at the time.
- **Diverse Data Structures:** The necessity to represent data structures that are difficult to express in relational terms, such as graphs, JSON documents, and data streams.
- **Simplicity:** The demand for simpler data models where the complexity of relational databases was unnecessary, such as key-value stores.

:::{iframe} https://www.youtube.com/watch?v=lYsQ_riVC4Y
This revolution led to an explosion of new database architectures, each tailored to specific use cases that traditional relational databases struggled to address.

## Evolution of Relational Databases

Despite the rise of NoSQL, relational databases have not remained static.
They have evolved to incorporate new capabilities for scalability and versatility.
Modern relational database management systems (RDBMS) are now highly adaptable, accommodating diverse data models and handling a wide range of data management tasks.
In many cases, they can replace a variety of specialized software systems, simplifying system design.

An example of this adaptability is the growing trend of using relational databases to streamline system architectures, as highlighted in articles like [“Just Use Postgres for Everything”](https://www.amazingcto.com/postgres-for-everything/).

:::{iframe} https://www.youtube.com/embed/lYsQ_riVC4Y
:width: 100%
Just use Postgres for everything
:::
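
To make the idea of one relational system covering several data models more concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module rather than Postgres so that it runs without a server, and the table names (`kv`, `edge`) and helper functions are hypothetical, chosen only for illustration. It shows a key-value store and a small graph traversal, both expressed with ordinary tables and SQL.

```python
import sqlite3

# In-memory database; all table and function names here are illustrative.
conn = sqlite3.connect(":memory:")

# A key-value store is a two-column table with a primary key.
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def kv_put(key, value):
    """Insert the pair, overwriting any existing value for the key."""
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
    )


def kv_get(key):
    """Return the stored value, or None if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


# A directed graph is a table of edges.
conn.execute(
    "CREATE TABLE edge (src TEXT NOT NULL, dst TEXT NOT NULL, "
    "PRIMARY KEY (src, dst))"
)
conn.executemany(
    "INSERT INTO edge (src, dst) VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")],
)


def reachable(start):
    """Return all nodes reachable from `start`, using a recursive query."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION  -- UNION removes duplicates, so cycles terminate
            SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach WHERE node <> ? ORDER BY node
        """,
        (start, start),
    ).fetchall()
    return [r[0] for r in rows]


kv_put("color", "blue")
kv_put("color", "green")   # overwrites the earlier value
print(kv_get("color"))
print(reachable("a"))
```

A dedicated key-value or graph engine may still be the better choice at large scale; the sketch only illustrates that the relational model can express these structures.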

The website https://db-engines.com/en/ranking keeps track of the most popular DBMS.
The relational data model dominates, although many of the popular database systems support other models, allowing for deviations from the relational data model.
## Scalable Architectures in Relational Databases

To meet the growing demand for scalable architectures, relational databases have evolved to incorporate distributed systems. These systems use consensus algorithms, such as Paxos and [Raft](https://www.usenix.org/conference/atc14/technical-sessions/presentation/ongaro), to ensure data consistency across globally distributed, high-performance databases. Notable examples of these advanced systems include Google Spanner [@10.1145/3035918.3056103] and CockroachDB [@10.1145/3318464.3386134].
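
The core rule shared by these consensus algorithms is that a write counts as committed only once a majority of replicas has accepted it. The sketch below illustrates that rule alone. It is not an implementation of Paxos or Raft (it omits leader election, terms, and log repair), and the `Replica` class and `replicate` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that stores entries in an in-memory log."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)

    def append(self, entry):
        """Accept the entry if the replica is reachable; report success."""
        if not self.up:
            return False
        self.log.append(entry)
        return True


def majority(n_replicas):
    """Smallest number of replicas that forms a majority."""
    return n_replicas // 2 + 1


def replicate(entry, replicas):
    """Send the entry to every replica; commit only on majority acknowledgment.

    In a real protocol, entries that fail to reach a majority may later be
    overwritten; this sketch does not model that cleanup.
    """
    acks = sum(1 for r in replicas if r.append(entry))
    return acks >= majority(len(replicas))


cluster = [Replica(f"r{i}") for i in range(5)]
cluster[3].up = False
cluster[4].up = False
print(replicate("x = 1", cluster))  # 3 of 5 replicas acknowledge

cluster[2].up = False
print(replicate("x = 2", cluster))  # only 2 of 5 replicas acknowledge
```

Because any two majorities of the same cluster overlap in at least one replica, a committed write stays visible to every later majority, which is what lets these systems stay consistent while tolerating the failure of a minority of nodes.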

Since its meteoric rise between 2008 and 2015, the term "NoSQL" has gradually fallen out of favor, as it no longer effectively describes the diverse landscape of modern databases. Today, we operate in a world with multiple data models, where the relational model remains dominant due to its mathematical rigor and versatility. However, it now coexists with more specialized and simpler models that cater to specific use cases.

## Current Landscape of Database Models

The website [DB-Engines Ranking](https://db-engines.com/en/ranking) tracks the popularity of various database management systems. While the relational data model continues to dominate, many popular databases now support multiple data models, allowing for deviations from strict relational structures.

Notably, the two most popular open-source relational databases, MySQL (along with its fork MariaDB) and PostgreSQL, remain at the forefront of this evolving landscape.
