What is a NoSQL database?
NoSQL databases are purpose-built for specific data models and have flexible schemas suited to building modern applications. They are widely used because they simplify development and deliver functionality and performance at any scale, for online services ranging from an online clothing store like ASOS to a college paper writing service like EssayShark. They use a variety of data models, including document, graph, key-value, in-memory, and search.
How does a NoSQL (non-relational) database work?
As mentioned above, NoSQL databases use a variety of data models to access and manage data, including document, graph, key-value, in-memory, and search. These types of databases are optimized for applications that work with large amounts of data and need low latency and flexible data models. This is achieved by relaxing some of the strict data consistency requirements that other types of databases impose.
Let's consider an example of a typical schema for a simple book database.
- In a relational database, a book record is often split into several parts (or "normalized") and stored in separate tables, with the relationships between them defined by primary and foreign key constraints. In this example, the "Books" table has the columns "ISBN", "Book Title" and "Edition Number", the "Authors" table has the columns "Author ID" and "Author Name", and the "Author-ISBN" table has the columns "Author ID" and "ISBN". The relational model is designed so that the database can enforce referential integrity between tables. Data is normalized to reduce redundancy and is generally optimized for storage.
- A book record in a NoSQL database is usually stored as a JSON document. For each book, or item, the values "ISBN", "Book Title", "Edition Number", "Author Name" and "Author ID" are stored as attributes in a single document. In this model, data is optimized for intuitive development and horizontal scalability.
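The two layouts above can be sketched side by side. This is only an illustration using plain Python structures rather than a real database; the sample ISBN, title, author and the helper `load_book_relational` are made up for the example.

```python
import json

# Made-up sample values, used only for illustration.
SAMPLE_ISBN = "000-0-00-000000-0"

# Relational layout: one book is spread across three tables linked by keys.
books = [{"isbn": SAMPLE_ISBN, "title": "Example Book", "edition": 2}]
authors = [{"author_id": 1, "name": "Jane Doe"}]
author_isbn = [{"author_id": 1, "isbn": SAMPLE_ISBN}]


def load_book_relational(isbn):
    """Reassemble a book by following the foreign keys (a hand-written join)."""
    book = next(b for b in books if b["isbn"] == isbn)
    ids = {link["author_id"] for link in author_isbn if link["isbn"] == isbn}
    names = [a["name"] for a in authors if a["author_id"] in ids]
    return {**book, "authors": names}


# Document layout: the same book kept as one self-contained JSON document.
book_document = {
    "isbn": SAMPLE_ISBN,
    "title": "Example Book",
    "edition": 2,
    "authors": [{"author_id": 1, "name": "Jane Doe"}],
}

print(load_book_relational(SAMPLE_ISBN))
print(json.dumps(book_document, indent=2))
```

Reading the relational version needs a lookup across three tables, while the document version is read in one step at the cost of repeating author data in every book that shares an author.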
Why do we need NoSQL databases?
NoSQL databases are well suited to many modern applications, such as mobile, gaming, and web applications, that require flexible, scalable, high-performance databases with rich functionality to deliver a good user experience.
Flexibility
As a rule, NoSQL databases offer flexible schemas that enable faster, more iterative development. Thanks to their flexible data models, NoSQL databases are well suited for semi-structured and unstructured data.
Scalability
NoSQL databases are designed to scale out across distributed clusters of hardware, rather than scale up by adding expensive, high-end servers. Some cloud service providers handle these operations in the background, providing a fully managed service.
High performance
NoSQL databases are optimized for specific data models (for example, document, graph, or key-value) and access patterns, which can yield higher performance than trying to achieve the same functionality with a relational database.
Wide functionality
NoSQL databases provide feature-rich APIs and data types that are purpose-built for their respective data models.
NoSQL database types
Key-value databases.
Key-value databases are highly partitionable and allow horizontal scaling to a degree that other types of databases cannot achieve. Good use cases for key-value databases are gaming, ad tech and IoT applications. Amazon DynamoDB is designed to deliver consistent latency of no more than a few milliseconds at any scale. This consistent performance was the main reason Snapchat Stories was moved to DynamoDB, since this Snapchat feature accounts for its largest storage write workload.
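To make the idea of partitioning by key more concrete, here is a minimal sketch of a key-value store that spreads keys over several partitions by hashing. The class `PartitionedKeyValueStore` and the key names are hypothetical, and this is not how DynamoDB or any particular product is implemented; real systems typically use more elaborate schemes such as consistent hashing so that partitions can be added without moving most keys.

```python
import hashlib


class PartitionedKeyValueStore:
    """Toy key-value store; each partition stands in for a separate node."""

    def __init__(self, partition_count):
        self.partitions = [{} for _ in range(partition_count)]

    def _partition_for(self, key):
        # A stable hash of the key decides which partition owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        return self.partitions[self._partition_for(key)].get(key, default)


store = PartitionedKeyValueStore(partition_count=4)
for player in ("alice", "bob", "carol", "dave"):
    store.put(f"player:{player}", {"score": len(player)})

print(store.get("player:alice"))
print([len(partition) for partition in store.partitions])
```

Because every read and write touches exactly one partition, adding partitions adds capacity without requiring any cross-partition coordination for single-key operations.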
Document.
In application code, data is often represented as an object or a JSON-like document, because this is an efficient and intuitive data model for developers. Document databases let developers store and query data in a database using the same document model that they use in their application code. The flexible, semi-structured, hierarchical nature of documents and document databases allows them to evolve with the needs of applications. The document model works well for catalogs, user profiles, and content management systems, where each document is unique and changes over time. Amazon DocumentDB (with MongoDB compatibility) and MongoDB are common document databases that provide powerful and intuitive APIs for agile development.
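As a rough illustration of documents that differ in shape and evolve over time, the sketch below keeps user profiles as plain Python dictionaries and queries them by a dotted field path. The helpers `get_path` and `find` and the sample profiles are invented for this example and only mimic, in a very simplified way, the kind of nested-field query that document databases offer.

```python
# Two profiles in the same collection with different sets of fields.
profiles = [
    {"_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"_id": 2, "name": "Ben", "address": {"city": "Oslo"}, "tags": ["beta"]},
]


def get_path(document, dotted_path, default=None):
    """Follow a path such as 'address.city' through nested dictionaries."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def find(collection, dotted_path, expected):
    """Return every document whose value at dotted_path equals expected."""
    return [doc for doc in collection if get_path(doc, dotted_path) == expected]


# A later version of the application adds a field to one document only;
# no schema migration of the other documents is needed.
profiles[0]["address"] = {"city": "Lima"}

print(find(profiles, "address.city", "Oslo"))
print(find(profiles, "address.city", "Lima"))
```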
Graph database.
Graph databases simplify the development and operation of applications that work with highly connected data sets. Typical use cases for graph databases are social networks, recommendation services, fraud detection systems and knowledge graphs. Amazon Neptune is a fully managed graph database service. Neptune supports both the Property Graph and Resource Description Framework (RDF) models, providing two graph APIs to choose from: TinkerPop and RDF/SPARQL. Other common graph technologies include Neo4j and Giraph.
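The recommendation use case mentioned above comes down to traversing relationships. The following sketch, which uses an in-memory adjacency structure rather than a graph database, suggests "friends of friends" ranked by the number of mutual friends. The names and the `recommend` function are hypothetical.

```python
from collections import Counter

# Undirected friendships stored as adjacency sets (made-up sample data).
friends = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee"},
    "dee": {"ben", "cy", "eli"},
    "eli": {"dee"},
}


def recommend(graph, person):
    """Rank people two hops away by how many mutual friends they share."""
    direct = graph[person]
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph[friend]:
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))


print(recommend(friends, "ana"))
print(recommend(friends, "eli"))
```

A graph database stores and indexes such relationships directly, so traversals like this do not require the repeated joins that a relational schema would need.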
In-memory databases.
Gaming and advertising applications often use leaderboards, real-time session stores and real-time analytics. Such use cases require response times measured in microseconds and can see sharp spikes in traffic at any time. Amazon ElastiCache offers Memcached and Redis for low-latency, high-throughput workloads that cannot be served from disk-based storage. Such workloads are characteristic, for example, of McDonald's. Another example of a purpose-built data store is Amazon DynamoDB Accelerator (DAX). DAX makes DynamoDB reads several times faster.
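A leaderboard is a good example of a structure that is cheap to maintain in memory. The sketch below is a plain Python stand-in, not ElastiCache, Memcached or Redis code; the `Leaderboard` class and player names are invented. It is roughly the kind of operation that sorted-set style structures in in-memory stores are meant to serve.

```python
import heapq


class Leaderboard:
    """Keeps all scores in memory so updates and reads avoid disk access."""

    def __init__(self):
        self._scores = {}

    def add_points(self, player, points):
        """Add points to a player's total and return the new total."""
        self._scores[player] = self._scores.get(player, 0) + points
        return self._scores[player]

    def top(self, n):
        """Return the n highest-scoring (player, score) pairs."""
        return heapq.nlargest(n, self._scores.items(), key=lambda item: item[1])


board = Leaderboard()
board.add_points("ana", 50)
board.add_points("ben", 70)
board.add_points("cy", 10)
board.add_points("ana", 30)

print(board.top(2))
```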
Search databases.
Many applications generate logs to make it easier for developers to troubleshoot and fix problems. Amazon Elasticsearch Service (Amazon ES) is a purpose-built service for visualizing and analyzing machine-generated data streams in near real time by indexing, aggregating, and searching semi-structured logs and metrics. Amazon ES is also a powerful, high-performance full-text search engine. Expedia uses more than 150 Amazon ES domains, with 30 TB of data and 30 billion documents, for a variety of mission-critical use cases, from operational monitoring and troubleshooting to distributed application stack tracing and cost optimization.
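Full-text search over logs is usually built on an inverted index, which maps each term to the documents that contain it. The sketch below shows the idea on three made-up log lines; the functions `tokenize`, `build_index` and `search` are illustrative only and do not reflect the Amazon ES or Elasticsearch API.

```python
import re
from collections import defaultdict

# Made-up log lines keyed by a document ID.
log_lines = {
    1: "ERROR payment service timeout",
    2: "INFO user login succeeded",
    3: "ERROR user login failed after timeout",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


index = build_index(log_lines)
print(search(index, "error timeout"))
print(search(index, "user login"))
```

Because the index is built once when data is ingested, a query only has to intersect a few small sets instead of scanning every log line.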
SQL (relational) and NoSQL (non-relational) comparison
For decades, the relational data model, used in relational databases such as Oracle, DB2, SQL Server, MySQL and PostgreSQL, was central to application development. In the mid-to-late 2000s, however, other data models began to gain significant adoption. The term "NoSQL" was introduced to refer to these emerging classes of databases and data models. "NoSQL" is often used as a synonym for "non-relational."
There are many types of NoSQL databases with various features, but the table below shows the main differences between SQL and NoSQL databases.
| | SQL databases | NoSQL databases |
|---|---|---|
| Suitable workloads | Relational databases are designed for transactional, strongly consistent online transaction processing (OLTP) applications and are well suited for online analytical processing (OLAP). | NoSQL databases (key-value, document, graph, and in-memory) are focused on OLTP for a variety of data access patterns, including low-latency applications. NoSQL search databases are intended for analytics over semi-structured data. |
| Data model | The relational model normalizes data into tables consisting of rows and columns. A schema strictly defines the tables, rows, columns, indexes, relationships between tables, and other database elements. The database enforces referential integrity in the relationships between tables. | NoSQL databases use a variety of data models, including document, graph, key-value, in-memory, and search. |
| ACID properties | Relational databases provide the ACID properties: atomicity, consistency, isolation, and durability. Atomicity requires that a transaction be executed completely or not at all. Consistency means that once a transaction is committed, the data must conform to the database schema. Isolation requires that concurrent transactions execute separately from each other. Durability is the ability to recover to the last saved state after an unexpected system failure or power outage. | NoSQL databases often make a trade-off, relaxing some of the strict ACID requirements in exchange for a more flexible data model that can scale horizontally. This makes NoSQL databases an excellent choice for high-throughput, low-latency use cases that need to scale horizontally beyond the limits of a single instance. |
| Performance | Performance mainly depends on the disk subsystem. Achieving maximum performance often requires optimizing queries, indexes, and table structure. | Performance usually depends on the size of the underlying hardware cluster, network latency, and the calling application. |
| Scaling | Relational databases are usually scaled up by increasing the compute capacity of the hardware, or scaled out by adding replicas for read-only workloads. | NoSQL databases are typically highly partitionable because key-value access patterns allow them to scale out using a distributed architecture. This increases throughput and provides consistent performance at almost unlimited scale. |
| API | Requests to store and retrieve data are written in SQL. These queries are parsed and executed by the relational database. | Object-based APIs allow application developers to easily store and retrieve in-memory data structures. Using partition keys, applications can look up key-value pairs, column sets, or semi-structured documents that contain serialized application objects and attributes. |
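The atomicity property listed in the comparison can be demonstrated with SQLite through Python's standard `sqlite3` module. This is a small sketch under assumed names: the `accounts` table, the `transfer` and `balances` functions and the sample balances are all invented. A transfer that would overdraw the sender violates a CHECK constraint, so the whole transaction is rolled back, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ana", 100), ("ben", 20)])
conn.commit()


def transfer(connection, sender, receiver, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(connection.execute("SELECT name, balance FROM accounts ORDER BY name"))


# The debit would make ana's balance negative, so the credit is undone too.
print(transfer(conn, "ana", "ben", 500))
print(balances(conn))
```

Many NoSQL systems limit this kind of guarantee, for example to operations on a single item or partition, which is part of the trade-off for horizontal scaling described in the table.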