Database

Data type

Structured data - normalized fashion

Normalized data means a unique entity occurs in only one place/table, in its simplest and atomic form, and is not spread throughout the database.
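As a minimal sketch of what "one entity in one place" looks like, the snippet below uses Python's built-in sqlite3 module. The table and column names (users, orders) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        item    TEXT NOT NULL
    );
""")

# The user's name is stored exactly once, in the users table.
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (user_id, item) VALUES (1, 'book')")
conn.execute("INSERT INTO orders (user_id, item) VALUES (1, 'pen')")

# Renaming the user touches one row; every order sees the new name via the join.
conn.execute("UPDATE users SET name = 'Alicia' WHERE id = 1")
rows = conn.execute("""
    SELECT users.name, orders.item
    FROM orders JOIN users ON users.id = orders.user_id
    ORDER BY orders.id
""").fetchall()
print(rows)
```

Because the name lives only in users, there is no second copy that could drift out of date.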

Unstructured Data

User state

Relational DB

Transactions & Data Consistency

ACID – Atomicity, Consistency, Isolation, Durability.
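A rough illustration of atomicity, assuming a hypothetical accounts table in an in-memory SQLite database: both updates of a transfer are committed together, or neither is applied. The CHECK constraint is an assumption added here to force a failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: balances may never go below zero.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "a", "b", 60)      # succeeds
failed = transfer(conn, "a", "b", 60)  # would overdraw 'a', so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the credit to 'b' has already run when the debit fails, yet the rollback undoes it, so no money appears from nowhere.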

Storing Relationships

Relational databases are built to store relationships.

Facebook uses a relational database as its main user-facing database.

MySQL, PostgreSQL, MariaDB, Amazon Aurora, Google Cloud SQL

NoSQL Databases

JSON-based databases for frequent reads & writes, typically required in social applications like Twitter, live real-time sports apps, massively multiplayer online games etc.

Why is NoSQL needed when we have relational DBs?

Scalability

Scaling SQL databases is not trivial. They have to be sharded or replicated to run smoothly on a cluster.
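One common way to shard is to hash a key and route it to one of several servers. The sketch below is an assumption-laden toy: the shard names are made up, and each "shard" is a plain dict standing in for a separate database server.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names

def shard_for(key: str) -> str:
    """Pick a shard by hashing the key (stable across processes)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]

# Each shard is just a dict standing in for a separate database server.
storage = {name: {} for name in SHARDS}

def put(key, value):
    storage[shard_for(key)][key] = value

def get(key):
    return storage[shard_for(key)].get(key)

for user_id in ("user:1", "user:2", "user:3", "user:4"):
    put(user_id, {"id": user_id})

print(get("user:3"))
```

With plain modulo hashing, changing the number of shards remaps most keys, which is one reason resharding a SQL cluster is painful.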

NoSQL databases have the ability to add new server nodes on the fly & continue the work.

Clustering

The server nodes even have self-healing capabilities.

Features

Pros

Gentle Learning Curve

Schemaless

Cons

Inconsistency

No ACID

MongoDB, Redis, Neo4J, Cassandra, Google Cloud Datastore

When to use?

Handling A Large Number Of Read Write Operations

Since they can add nodes on the fly, they can handle more concurrent traffic & large amounts of data with minimal latency.

Flexibility With Data Modeling

Eventual Consistency Over Strong Consistency

Running Data Analytics

Database use cases

Time-Series databases

Wide-Column - Cassandra or HBase

Document-Oriented - Elasticsearch

Key-value store - Redis or Memcached

Is NoSQL more performant?

Stacks such as MEAN/MERN (MongoDB, ExpressJS, AngularJS/ReactJS, NodeJS) pick a NoSQL DB.

Facebook uses MySQL for storing its social graph of millions of users.

Quora uses MySQL by partitioning the data at the application level

Polyglot persistence: using both SQL and NoSQL databases in the same system

Facebook case study:

facebook-db-example.jpeg

Multi-Model Databases

A single DB supports several data models: graph, document-oriented, relational

e.g.

ArangoDB, Cosmos DB, OrientDB, Couchbase

facebook-db-multi-node-example.jpeg

Eventual consistency

eventual-consistency.jpeg

The upside of eventual consistency is that the system can add new nodes on the fly without blocking any of them. The nodes stay available to end-users for updates at all times.
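To make the trade-off concrete, here is a toy simulation, not a real replication protocol. The class names and the "us"/"eu" replica names are hypothetical. A write is accepted by one replica immediately and reaches the others only when replicate() runs, so a read in between can return stale data.

```python
from collections import deque

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class EventuallyConsistentStore:
    """Writes land on one replica immediately and reach the others later."""

    def __init__(self, names):
        self.replicas = {n: Replica(n) for n in names}
        self.pending = deque()  # replication messages not yet delivered

    def write(self, replica_name, key, value):
        # No locking: the local replica accepts the write right away.
        self.replicas[replica_name].data[key] = value
        for name in self.replicas:
            if name != replica_name:
                self.pending.append((name, key, value))

    def read(self, replica_name, key):
        return self.replicas[replica_name].data.get(key)

    def replicate(self):
        # Deliver all queued updates; after this the replicas agree.
        while self.pending:
            name, key, value = self.pending.popleft()
            self.replicas[name].data[key] = value

store = EventuallyConsistentStore(["us", "eu"])
store.write("us", "likes", 10)
stale = store.read("eu", "likes")  # update has not arrived yet
store.replicate()
fresh = store.read("eu", "likes")  # replicas have converged
print(stale, fresh)
```

The sketch ignores conflicting concurrent writes, which real systems must resolve with some merge or ordering rule.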

Strong consistency

The nodes that need to be updated are locked down until the update is applied everywhere.

strong-consistency.jpeg

All incoming requests are queued while the nodes are locked (see the CAP theorem).
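The locking and queuing idea can be sketched with a single lock guarding all replicas. This is an in-process analogy only, with hypothetical names; a distributed database would use a coordination protocol rather than a thread lock.

```python
import threading

class StronglyConsistentStore:
    """Every write updates all replicas while holding one lock."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.lock = threading.Lock()

    def write(self, key, value):
        # Other readers and writers wait (queue up) until all replicas are updated.
        with self.lock:
            for replica in self.replicas:
                replica[key] = value

    def read(self, replica_index, key):
        with self.lock:
            return self.replicas[replica_index].get(key)

store = StronglyConsistentStore(3)
threads = [
    threading.Thread(target=store.write, args=("likes", n)) for n in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Whichever write ran last, every replica returns the same value.
values = {store.read(i, "likes") for i in range(3)}
print(values)
```

The cost is visible in the structure: while one request holds the lock, every other request waits.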

ACID transaction support within a geographic zone.

NoSQL for highly available and scalable systems.

Consistency, Availability, Partition tolerance (CAP)

The CAP theorem simply states that in case of a network failure, when a few of the nodes of the system are down, we have to make a choice between Availability & Consistency.
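A tiny sketch of that choice, assuming two hypothetical node modes labelled "CP" and "AP" for this example only. During a partition the CP node refuses to answer rather than risk a stale value, while the AP node answers with whatever it has.

```python
class Node:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (labels used only in this sketch)
        self.value = "v1"         # last value this node knows about
        self.partitioned = False  # True when the node cannot reach its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse to answer rather than risk stale data.
            raise RuntimeError("unavailable: cannot confirm latest value")
        # Availability first (or no partition): answer with the local value.
        return self.value

def try_read(node):
    try:
        return node.read()
    except RuntimeError as exc:
        return "error: %s" % exc

cp, ap = Node("CP"), Node("AP")
cp.partitioned = ap.partitioned = True  # simulate the network failure

cp_result = try_read(cp)
ap_result = try_read(ap)
print(cp_result)
print(ap_result)
```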

Types of Databases

Document-Oriented database

Key-value datastore

Minimal latency; commonly used to implement caching.

Popular: Redis, Hazelcast, Riak, Voldemort & Memcached.
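The caching use case can be sketched with a small in-memory key-value store with expiry. This is a stand-in written for illustration, not the API of Redis or Memcached; TTLCache, slow_lookup and get_user are hypothetical names.

```python
import time

class TTLCache:
    """Minimal key-value store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

calls = []  # records how often the slow backend is hit

def slow_lookup(user_id):
    # Stand-in for a query against the primary database.
    calls.append(user_id)
    return {"id": user_id, "name": "user-%d" % user_id}

cache = TTLCache(ttl_seconds=60)

def get_user(user_id):
    key = "user:%d" % user_id
    user = cache.get(key)
    if user is None:  # cache miss: load from the backend, then cache it
        user = slow_lookup(user_id)
        cache.set(key, user)
    return user

first = get_user(7)
second = get_user(7)  # served from the cache
print(first, len(calls))
```

The lookup by exact key is a single dictionary access, which is the property that makes key-value stores a natural fit for caches.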

Wide-column database

Primarily used to handle massive amounts of data, commonly called Big Data

Analytical use cases, ensuring scalability, performance & high availability at the same time.

Wide-column (column-oriented) databases store data in records with a dynamic number of columns. A record can hold billions of columns.
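The "dynamic number of columns" idea can be modelled, very loosely, as a map from row key to a per-row map of columns. The row keys and column names below are hypothetical, and the sketch says nothing about how Cassandra or HBase lay data out on disk.

```python
from collections import defaultdict

# row key -> {column name -> value}; each row may have its own set of columns.
table = defaultdict(dict)

def put(row_key, column, value):
    table[row_key][column] = value

def get_row(row_key):
    return dict(table.get(row_key, {}))

def columns_with_prefix(row_key, prefix):
    """Return (column, value) pairs whose column name starts with prefix."""
    row = table.get(row_key, {})
    return sorted((c, v) for c, v in row.items() if c.startswith(prefix))

# Rows do not share a fixed schema: user:1 has three columns, user:2 has one.
put("user:1", "name", "Alice")
put("user:1", "watched:2021-01-01", "movie-a")
put("user:1", "watched:2021-01-02", "movie-b")
put("user:2", "name", "Bob")

history = columns_with_prefix("user:1", "watched:")
print(history)
```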

Popular: Cassandra, HBase, Google Bigtable, ScyllaDB

Netflix uses Cassandra as the backend database for the streaming service

Adobe uses HBase for processing large amounts of data

Relational database

Graph database

Features:

visualization

Low latency: no need for SQL joins, just follow the edges, e.g. Google Maps.

Popular: Neo4J

Use cases: building social, knowledge, network graphs. Writing AI-based apps, recommendation engines, fraud analysis app, storing genetic data etc.
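"Just follow the edges" can be sketched with a plain adjacency map and a breadth-first walk. The tiny social graph below is made up, and the code is plain Python rather than a Neo4J query; it finds friends-of-friends as a simple recommendation.

```python
from collections import deque

# Hypothetical social graph: person -> list of direct friends.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

def within_hops(start, max_hops):
    """Breadth-first walk; returns {person: hop distance} up to max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen

hops = within_hops("alice", 2)
# People exactly two hops away are friends-of-friends, not yet direct friends.
suggestions = sorted(person for person, d in hops.items() if d == 2)
print(suggestions)
```

Each step only looks at the neighbours of the current node, so the work depends on the size of the neighbourhood visited, not on the total number of people stored.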

graph-db.jpeg

Walmart shows product recommendations to its customers in real-time using Neo4J graph database

NASA uses Neo4J to store “lessons learned” data from their previous missions to educate the scientists & engineers.

Time-Series database

optimized for tracking & persisting time series data
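A minimal sketch of the access pattern such databases optimize for: appending points in time order, then querying a time window. The TimeSeries class and the sample readings are hypothetical, and real products add compression, retention and downsampling on top.

```python
import bisect

class TimeSeries:
    """Append-only series kept sorted by timestamp for fast range queries."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def append(self, timestamp, value):
        if self.timestamps and timestamp < self.timestamps[-1]:
            raise ValueError("points must arrive in time order")
        self.timestamps.append(timestamp)
        self.values.append(value)

    def range(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_left(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def average(self, start, end):
        points = self.range(start, end)
        if not points:
            return None
        return sum(v for _, v in points) / len(points)

# Hypothetical sensor readings: (seconds since start, temperature).
series = TimeSeries()
for t, temp in [(0, 20.0), (60, 21.0), (120, 23.0), (180, 22.0)]:
    series.append(t, temp)

window = series.range(60, 180)
mean = series.average(60, 180)
print(window, mean)
```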

Use cases: data from IoT devices, self-driving vehicles, industry sensors, social networks, stock market financial data

Writing an autonomous trading platform which deals with changing stock prices in real-time etc

fetching data from IoT devices. Managing data for running analytics & monitoring

Primary usage: running analytics, deducing conclusions and making future business decisions looking at the results of the analytics.

Popular: InfluxDB, TimescaleDB, Prometheus

IBM uses InfluxDB to run analytics for real-time cognitive fraud detection

Spiio uses InfluxDB to remotely monitor vertical lining green walls & plant installations.

Databases dedicated to mobile apps

Five of the Most Popular Databases for Mobile Apps - Trigent Vantage

The Types of Modern Databases | Alooma