NoSQL databases enable you to store data with a flexible schema and a variety of data models. These databases are relatively easy for developers to use, and have the high performance and functionality needed for modern applications. AWS NoSQL databases can hold large volumes of data while still providing low latency.
As part of AWS database offerings, there are six types of NoSQL databases you can select from along with a variety of managed and self-managed database services. These database services are designed to support your cloud-native workloads and smoothly integrate with existing AWS resources.
NoSQL is a term originally coined by Carlo Strozzi in 1998 to refer to an open-source relational database that did not use SQL. Then, in 2009, the term was used again to refer in general to non-relational databases. This term can stand for either “no SQL” or “not only SQL” depending on the construction of the database to which it’s applied.
The development of NoSQL databases stems from the growth of web data, which created a need for faster processing in general and for processing of unstructured data in particular. These systems can be built on a distributed architecture, which allows them to scale out and to process data close to its source or to the user, reducing latency. This was especially important for the growth of big data and led to the adoption of NoSQL systems by many tech companies including Google, Facebook, and Twitter.
There are six types of NoSQL database models you can choose from in AWS.
Key-value databases enable you to store data in pairs containing a unique ID and a data value. This provides a flexible storage structure since values are not assigned to a table and can hold any amount or structure of data. These databases can manage large volumes of data or requests. Use cases for key-value databases include gaming applications, eCommerce systems, and high traffic applications.
AWS service: Amazon DynamoDB
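The key-value model described above can be illustrated with a minimal in-memory sketch. This is not the DynamoDB API; the class name `KeyValueStore`, its methods, and the sample keys are hypothetical and exist only to show that values are opaque to the store and are reached directly through their unique ID.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (illustrative only, not an AWS API)."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is opaque to the store: any shape or size is accepted.
        self._items[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookups go straight to the unique key; nothing is scanned.
        return self._items.get(key)


store = KeyValueStore()
store.put("player#42", {"name": "Ana", "score": 1800})  # structured value
store.put("session#abc", "opaque-token")                # plain string value
```

Because every read and write addresses a single key, this access pattern is straightforward to partition across many machines, which is one reason key-value stores suit high-traffic workloads.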
Document databases are structured similarly to key-value databases, except that keys and values are stored in documents written in a format such as JSON, XML, or YAML. You can use these databases to store hierarchies of data by nesting values inside a document or by linking documents to one another. Use cases for document databases include user profiles, catalogs, and content management.
AWS service: Amazon DocumentDB, DynamoDB
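As a rough sketch of the document model, the example below keeps a small catalog as a list of Python dictionaries. The field names (`_id`, `related`, `dimensions`) and the helper functions are assumptions made for illustration; they are not part of DocumentDB or DynamoDB.

```python
import json
from typing import Any, Dict, List

# Two documents in one collection; they do not share an identical schema.
catalog: List[Dict[str, Any]] = [
    {"_id": "p1", "name": "Kettle", "price": 25.0, "tags": ["kitchen"]},
    {"_id": "p2", "name": "Lamp", "price": 40.0,
     "dimensions": {"height_cm": 45}, "related": ["p1"]},
]


def find(collection: List[Dict[str, Any]], field: str, value: Any) -> List[Dict[str, Any]]:
    """Return documents whose top-level field equals value."""
    return [doc for doc in collection if doc.get(field) == value]


def resolve_related(collection: List[Dict[str, Any]], doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Follow the 'related' ID links from one document to others."""
    by_id = {d["_id"]: d for d in collection}
    return [by_id[i] for i in doc.get("related", []) if i in by_id]


lamp = find(catalog, "name", "Lamp")[0]
serialized = json.dumps(lamp)  # a document serializes naturally to JSON
```

Here the lamp document nests a `dimensions` object and links to the kettle by ID, showing both ways of expressing a hierarchy.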
Wide column databases are based on tables but without a strict column format. Rows do not need a value in every column and segments of rows and columns containing different data formats can be combined. Use cases for wide column databases include route optimization, fleet management, and industrial maintenance applications.
AWS service: Amazon Keyspaces (for Apache Cassandra)
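The idea that rows in a wide column table need not share the same columns can be sketched as a map of row keys to sparse column maps. The row keys, column names, and the `read_column` helper are hypothetical; real Cassandra-compatible tables are defined and queried with CQL rather than Python dictionaries.

```python
from collections import defaultdict
from typing import Any, Dict

# row key -> {column name -> value}; each row holds only the columns it needs.
table: Dict[str, Dict[str, Any]] = defaultdict(dict)

table["truck-1"]["last_lat"] = 52.52
table["truck-1"]["last_lon"] = 13.40
table["truck-2"]["engine_hours"] = 1200         # different columns, same table
table["truck-2"]["next_service"] = "2030-01-15"


def read_column(row_key: str, column: str, default: Any = None) -> Any:
    """Read one column; a column the row never wrote is simply absent."""
    return table.get(row_key, {}).get(column, default)
```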
Graph databases are structured as collections of edges and nodes. Nodes are the individual data values and edges are the relationships between those values. These databases enable you to track intricately related data in an organic network rather than a structured table. Use cases for graph databases include recommendation engines, social networking, and fraud detection.
AWS service: Amazon Neptune
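To make the node-and-edge structure concrete, here is a minimal sketch of a friend-of-friend recommendation over an adjacency map. The names and the `recommend` function are illustrative assumptions; with Neptune you would express a traversal like this in a graph query language rather than in application code.

```python
from collections import defaultdict
from typing import Dict, Set

# Undirected "friend" edges stored as an adjacency map: node -> neighbours.
edges: Dict[str, Set[str]] = defaultdict(set)


def add_edge(a: str, b: str) -> None:
    edges[a].add(b)
    edges[b].add(a)


def recommend(person: str) -> Set[str]:
    """Friends of friends who are not already direct friends."""
    direct = edges.get(person, set())
    candidates: Set[str] = set()
    for friend in direct:
        candidates |= edges.get(friend, set())
    return candidates - direct - {person}


add_edge("ana", "ben")
add_edge("ben", "cho")
add_edge("ana", "dev")
add_edge("dev", "cho")
add_edge("cho", "eli")
```

The traversal follows relationships directly from node to node, which is the operation that becomes expensive when the same data is forced into joined relational tables.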
Time series databases store data in time-ordered streams. Data is not sorted by value or ID but by the time of collection, ingestion, or other timestamps included in the metadata. These databases enable you to manage and query data based on time intervals. Use cases for time series databases include industrial telemetry, DevOps, and Internet of Things (IoT) applications.
AWS service: Amazon Timestream
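A minimal sketch of time-ordered storage and interval queries follows. It keeps points sorted by timestamp and uses binary search to slice out a time window; the function names and sample readings are hypothetical and do not reflect how Timestream stores or queries data internally.

```python
import bisect
from typing import List, Tuple

# (timestamp_seconds, value) points kept in time order.
points: List[Tuple[int, float]] = []


def ingest(ts: int, value: float) -> None:
    """Insert a point while keeping the list ordered by timestamp."""
    bisect.insort(points, (ts, value))


def query_range(start: int, end: int) -> List[Tuple[int, float]]:
    """Return points with start <= ts < end, located by binary search."""
    lo = bisect.bisect_left(points, (start, float("-inf")))
    hi = bisect.bisect_left(points, (end, float("-inf")))
    return points[lo:hi]


# Readings may arrive out of order; ingestion restores time order.
for ts, temp in [(30, 21.5), (10, 20.0), (20, 21.0), (40, 22.0)]:
    ingest(ts, temp)

window = query_range(10, 40)
average = sum(value for _, value in window) / len(window)
```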
Ledger databases are based on logs that record events related to data values. These logs are transparent, immutable, and can be verified cryptographically to prove the authenticity and integrity of data. Use cases for ledger databases include banking systems, registrations, supply chains, and systems of record.
AWS service: Amazon Quantum Ledger Database (QLDB)
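The way an immutable log can be verified cryptographically can be sketched with a simple hash chain, in which each entry stores the SHA-256 hash of its payload combined with the previous entry's hash. This is a simplified illustration under that assumption and does not reproduce QLDB's actual journal format; the `append` and `verify` helpers and the sample account events are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(journal: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append-only write: each entry commits to the hash of the one before."""
    prev_hash = journal[-1]["hash"] if journal else GENESIS
    journal.append({"payload": payload, "prev": prev_hash,
                    "hash": _digest(prev_hash, payload)})


def verify(journal: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edit to past data breaks the chain."""
    prev_hash = GENESIS
    for entry in journal:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


journal: List[Dict[str, Any]] = []
append(journal, {"account": "A1", "event": "open", "balance": 0})
append(journal, {"account": "A1", "event": "deposit", "balance": 100})
```

If any earlier payload is altered after the fact, `verify` returns `False`, because the recomputed hash no longer matches the stored one.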
You may be able to choose a database service based solely on the type of database you need. However, it is helpful to understand the features of the services AWS offers before making your choice. If these services do not provide the features or capabilities you need, you can look for third-party options instead.
Amazon DynamoDB is a document and key-value database. It is a fully managed service that includes features for backup and restore, in-memory caching, security, and multi-region, multi-master distribution. DynamoDB supports atomicity, consistency, isolation, durability (ACID) transactions and encrypts data at rest by default.
Amazon ElastiCache is an in-memory data store that you can use in place of a disk-based database. It provides fully managed support for Memcached and Redis, and enables scaling with memory sharding. It is designed to support sub-millisecond response times and is typically used for queuing, real-time analytics, caching, and session stores.
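The caching use case can be sketched with a cache-aside pattern: read from memory first, and only fall through to the slower store on a miss. The `TTLCache` class, the `slow_lookup` function, and the 10-second expiry below are assumptions for illustration; in practice the in-memory layer would be a Redis or Memcached client rather than a Python dictionary.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class TTLCache:
    """Toy cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                    # cache hit: served from memory
        value = loader(key)                    # miss or expired: ask the slow store
        self._data[key] = (now + self._ttl, value)
        return value


calls: List[str] = []


def slow_lookup(key: str) -> str:
    """Stand-in for a disk-based database query."""
    calls.append(key)
    return key.upper()


cache = TTLCache(ttl_seconds=10)
first = cache.get_or_load("user:1", slow_lookup)
second = cache.get_or_load("user:1", slow_lookup)  # answered from the cache
```

The injectable `clock` parameter is a design choice that makes expiry testable without waiting for real time to pass.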
Amazon Neptune is a graph database service that is fully managed and optimized for storing data on billions of relationships. It supports two graph models, Property Graph and W3C's RDF, along with their respective query languages, TinkerPop Gremlin and SPARQL.
Neptune includes features for point-in-time recovery, multi-zone data replication, continuous backups, and read replicas. It supports ACID transactions and provides encryption in transit and at rest.
Amazon Timestream is a fully managed time series database with an adaptive query processing engine. It is a serverless service and automatically manages hardware and software maintenance and provisioning for you.
Timestream includes features for automated data compression, tiering, retention, and rollups. It also includes built-in analytics for the approximation, smoothing, and interpolation of data.
Amazon QLDB is a ledger database that you can use to track data changes. It is fully managed and designed to enable you to avoid complex setups required for managing ledger data with relational databases or blockchain.
QLDB provides a SQL-like API, a flexible document data model, and full support for ACID-compliant transactions. It also includes features for automatic scaling, multi-zone availability, and data streaming with Kinesis Data Streams.
Amazon DocumentDB is a fully managed document database that is compatible with MongoDB. DocumentDB architecture separates compute and storage resources for greater scalability and flexibility. It also includes support for up to 15 read replicas, data replication for durability across three availability zones, and free use of the AWS Database Migration Service.
Amazon Keyspaces is a managed wide column database that is compatible with Apache Cassandra. You can use it to migrate Cassandra workloads and applications and continue to use Cassandra native code and tools. It includes features for autoscaling and enables you to select between on-demand or provisioned resources.
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports a capacity of up to 368TB and a range of use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
In particular, Cloud Volumes ONTAP helps address the challenges of database workloads in the cloud, filling the gap between the capabilities of your cloud-based database and the public cloud resources it runs on.
Cloud Volumes ONTAP supports advanced features for managing SAN storage in the cloud, catering for NoSQL database systems, as well as NFS shares that can be accessed directly from cloud big data analytics clusters.
In addition, the built-in storage efficiency features have a direct impact on costs for NoSQL in cloud deployments. The data protection and flexibility provided by features such as snapshots and data cloning give NoSQL database administrators and big data engineers the power to manage large volumes of data effectively.