How Does Cloud Storage Work?
Cloud storage systems are typically made up of large numbers of distributed data centers or servers. The resources of these centers and servers are leased out to customers as reserved capacity or on-demand. When you store data in cloud resources, the service provider is responsible for ensuring that it is durable and available to users. This is done by replicating data across multiple servers or data centers.
Cloud Storage Technology
Although object storage is the most common cloud storage technology, there are other options. Generally, these include block storage (such as Amazon EBS) and file storage (such as Azure Files).
Block Storage
Block storage is a type of storage that uses abstraction to create volumes or blocks in a low-level storage device. These blocks serve as virtual hard drives that you can attach to instances or VMs to serve as persistent storage.
Learn more in the detailed guide to block storage
Cloud File Sharing
Cloud file sharing is a type of storage that holds data hierarchically using files and folders. File storage is the standard storage type used by on-premises machines and servers. Typically, you access data through the Network File System (NFS) protocol or the Server Message Block (SMB) protocol.
Distributed Storage
Distributed storage is not a type of storage so much as a storage infrastructure. It enables you to split storage across multiple workstations, servers, or data centers. Each storage device serves as a node in a storage cluster that you manage centrally. You can create distributed storage infrastructures for object, block, and file storage.
Distributed storage infrastructures are what provide the benefits of cloud storage, including:
- Scalability—you can scale storage horizontally by increasing the number of storage nodes in your cluster.
- Redundancy—you can store multiple copies of data in remote locations for greater availability and durability. Syncing enables you to ensure data is mirrored.
- Cost—you can use lower performance or commodity hardware that is linked together to provide the same storage volume as higher cost solutions.
- Performance—you can ensure that users can access data from nearby locations, reducing latency. You can also enable massively parallel access, splitting data retrieval across resources.
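To make the scalability and redundancy points above more concrete, here is a minimal in-memory sketch of a storage cluster that places each object on several nodes and keeps serving reads when one node fails. The `StorageCluster` class, the node names, and the hash-based placement rule are all assumptions made for illustration; real cloud storage systems use more sophisticated placement and repair mechanisms.

```python
import hashlib


class StorageCluster:
    """Toy in-memory cluster: each node is a dict of key -> bytes."""

    def __init__(self, node_names, replicas=2):
        self.nodes = {name: {} for name in node_names}
        self.offline = set()
        self.replicas = replicas

    def _placement(self, key):
        # Hash the key to pick a starting node, then take the following
        # nodes in sorted order so that copies land on distinct nodes.
        names = sorted(self.nodes)
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(names)
        count = min(self.replicas, len(names))
        return [names[(start + i) % len(names)] for i in range(count)]

    def put(self, key, data):
        # Write one copy to every node chosen for this key.
        for name in self._placement(key):
            self.nodes[name][key] = data

    def get(self, key):
        # Read from the first replica that is still online.
        for name in self._placement(key):
            if name not in self.offline:
                return self.nodes[name][key]
        raise KeyError(f"no online replica holds {key!r}")


cluster = StorageCluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.put("report.csv", b"q1,q2\n10,12\n")
primary = cluster._placement("report.csv")[0]
cluster.offline.add(primary)          # simulate a hardware failure
print(cluster.get("report.csv"))      # still served by the second replica
```

In this sketch, adding names to the node list is the horizontal scaling step, and raising `replicas` trades capacity for availability.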
Benefits of Cloud Storage to Businesses
Switching to cloud storage can provide cost saving, availability, and data reliability benefits for businesses. Below are the most common benefits that you can expect to receive from cloud storage services.
Reduced Capital Expenses
Moving to public or hosted private cloud services reduces your need to purchase or maintain hardware. This means you are not responsible for the resources required to keep hardware up to date, to house, cool, or secure hardware, to monitor hardware, or to capacity plan. Since infrastructure is leased you also take on less technical debt and can redirect capital expenses to operational tasks.
Data Tiering for Cost Savings
Cloud storage services often offer multiple hardware or access tiers to choose from. This enables you to tier your data according to access priority and frequency. You can limit high performance storage resources to your most critical data and push lower priority data to lower performance tiers. This enables you to save costs without having to purchase or maintain additional hardware.
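As a rough illustration of tiering by access frequency, the sketch below assigns each object to a tier based on how long it has been idle. The tier names, the 30-day and 180-day thresholds, and the file names are hypothetical; actual tiers and limits depend on the provider and on your own access patterns.

```python
from datetime import date

# Hypothetical tiers and thresholds, ordered from fastest to cheapest.
TIERS = [
    ("hot", 30),       # accessed within the last 30 days
    ("cool", 180),     # accessed within the last 180 days
]
ARCHIVE_TIER = "archive"


def choose_tier(last_accessed, today):
    """Return the tier name for an object based on days since last access."""
    idle_days = (today - last_accessed).days
    for tier, max_idle_days in TIERS:
        if idle_days <= max_idle_days:
            return tier
    return ARCHIVE_TIER


today = date(2024, 6, 30)
objects = {
    "orders.db": date(2024, 6, 28),
    "q1-report.pdf": date(2024, 3, 15),
    "2019-logs.tar": date(2020, 1, 5),
}
plan = {name: choose_tier(last, today) for name, last in objects.items()}
print(plan)
```

A policy like this would typically run on a schedule, with the resulting plan driving the actual moves between tiers.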
Data Redundancy and Replication
Cloud storage services typically include automatic data replication and redundancy features. These features distribute your data across servers, data centers, availability zones, or regions to ensure that data remains available. This redundancy helps protect you from hardware failures, natural disasters, and issues related to heavy traffic.
The remote accessibility of cloud storage enables you to work with distributed teams and users easily. You can access data stored in the cloud at any time from anywhere and on almost any device. Additionally, since cloud data access is centralized, it’s easier for IT teams to manage data without limiting accessibility.
In addition to the built-in data replication that most cloud storage services provide, you can use cloud storage to enable reliable disaster recovery. Since cloud storage is typically remote, you can use it to store failover systems or backups that will remain available even if your on-premises systems go down.
Easier File Upload
Traditionally, users would upload files directly to servers using protocols like FTP and SFTP. This was cumbersome, slow, and error prone. Modern cloud storage services provide seamless methods for uploading files, such as desktop agents that synchronize files directly to the cloud, and convenient web interfaces for uploading files.
Learn more in the detailed guide to file upload
Cloud Storage Challenges
When considering the benefits of cloud storage services, you should also be sure to consider the challenges you might face implementing these resources.
On an individual level cloud resources are easy to provision and consume. Most users can manage setting up a Google Drive or Dropbox account. However, configuring storage resources for an organization with multiple users, diverse business goals, and a variety of compliance obligations is much more complex.
When adopting cloud storage resources, you cannot simply move data and trust your IT team to do the rest. Instead, you need to plan your migration carefully and include staff with cloud experience and expertise.
Data Movement Challenges
Once you plan your migration, you may also run into challenges when moving data. The amount of data you need to migrate, the support for data formats, and the security of the transfer all play a role.
Additionally, you may need to shut down or enable syncing for some applications and components to ensure no data is lost. If you do not manage data carefully, you may not be able to use it effectively after migration.
When you store data in the cloud, you have less control over it than when it is stored on-premises. This is caused by several main issues:
- Data is more accessible to outsiders due to Internet connectivity.
- Monitoring is more difficult due to distribution of data.
- You are reliant on the cloud provider for infrastructure security.
In combination, these factors put your data at greater risk, particularly if you do not understand which aspects of security are your responsibility. Not properly implementing access controls or securing storage resources can provide malicious parties full access to your data and accounts.
Learn more in the detailed guide to cloud storage security
Public vs Private vs Hybrid Cloud Storage Services
When considering cloud storage, there are several different service models you can choose from depending on your intended use, budget, and required level of control. Storage services can be grouped into two main categories: public and private. There are also subcategories you can choose from, defined as hybrid and multicloud.
Public Cloud Storage
Public cloud storage services are based on resources that are owned, maintained, and operated by the cloud storage provider. Google, Azure, and Amazon are the three largest public cloud storage providers.
Public cloud resources are designed for multiple tenants per server. Providers separate tenant or customer data through access controls, security policies, and data isolation practices. Some public cloud providers also offer dedicated servers for greater isolation. The resources you lease are made available over Internet connections and you can access data through web interfaces which generally use REST APIs.
Public cloud storage services can be reserved or used on-demand. Reserved resources may require advance payment or an agreement to retain services for a specific period of time. On-demand services are available as needed and are paid for on a monthly basis. Services used are typically charged according to the number of gigabytes used and the amount of bandwidth used for data access or transfer.
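The usage-based charging described above can be sketched as a simple calculation. The function name and all rates below are hypothetical placeholders, not any provider's published prices.

```python
def monthly_storage_cost(stored_gb, egress_gb, price_per_gb, price_per_egress_gb):
    """Estimate a monthly bill from capacity used and data transferred out."""
    return stored_gb * price_per_gb + egress_gb * price_per_egress_gb


# Hypothetical rates in USD; check your provider's price list for real numbers.
cost = monthly_storage_cost(
    stored_gb=500,
    egress_gb=100,
    price_per_gb=0.02,
    price_per_egress_gb=0.09,
)
print(f"${cost:.2f}")
```

Real bills usually include further line items, such as request charges, so treat this as a lower bound on what to expect.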
Hybrid Cloud Storage
Hybrid cloud storage is a combination of remote cloud storage (either public or private) and local resources. Storage is often implemented with proprietary software or appliances that sync with cloud resources via API. These infrastructures are typically used by organizations that want or need to keep data locally accessible, for example because of mission-critical legacy applications or compliance restrictions.
When operating a hybrid infrastructure you can have separate data stored in cloud or local resources, or you can sync data. For example, you can use policy engines to transfer infrequently accessed data to the cloud while retaining frequently accessed data. Meanwhile, syncing data enables you to gain the cloud's availability while maintaining low latency provided by on-site resources.
Multicloud Storage
Multicloud storage is a strategy that uses multiple types of cloud storage services (public and private) or services from different vendors. The purpose of multicloud strategies is to diversify your services to avoid vendor lock-in, optimize use according to resource capability, or to enable use of otherwise incompatible services or applications. Multicloud storage strategies can combine native services, supplier-integrated services, and marketplace services.
Private Cloud Storage
Private cloud storage services can be remote or on-premises. Services can be owned and operated in-house or by a cloud provider. These services provide dedicated, single tenant resources and allow greater control over your stored data.
Depending on the location of your private cloud services, you can base your infrastructure on your existing hardware, newly purchased hardware, or hardware leased from a provider. Also depending on location, you can access your stored data through private networks or through the Internet.
The cost of private cloud services depends on whether you operate them yourself. If you do, costs are based on necessary hardware purchases, ongoing hardware maintenance, and cloud infrastructure operation and management. If a vendor operates your private cloud, you are typically charged based on the resources you use and the level of support you need.
Public Cloud Storage: What Do the Big Three Providers Offer?
The following table summarizes storage services offered by the big three cloud providers.
Google Cloud Platform
Persistent Disk Storage
Azure Long-Term Storage
Nearline & Coldline
Azure Import/Export Service
Storage Transfer Service
Read on to learn about each cloud provider’s storage services in brief.
AWS Storage Services
Amazon Web Services (AWS) is a top cloud provider, offering hundreds of cloud computing products and services, including storage. Popular and notable cloud services include AWS EBS, AWS EFS, Amazon S3, and Amazon S3 Glacier.
Amazon Elastic Block Store (EBS)
AWS EBS lets you store data as blocks on persistent volumes. EBS was designed to provide persistent storage for Amazon Elastic Compute Cloud (Amazon EC2) instances and is commonly used for this purpose.
You can use EBS for a range of purposes, including stream processing, relational databases like MySQL, NoSQL databases like Redis, data warehouses, and any workload that requires high throughput and low latency.
Read more in our detailed guide about AWS EBS
Amazon Elastic File System (EFS)
AWS EFS is a file system based in the cloud that can elastically scale to petabytes. You can mount EFS as a drive on EC2 instances and use it to provide shared access to data. EFS is ideal for lift and shift migrations, database backup, development and testing, and content management systems (CMS). You can also leverage EFS as persistent storage for Kubernetes or Docker.
Read more in our detailed guide about AWS EFS
Amazon Simple Storage Service (S3)
Amazon Simple Storage Service (Amazon S3) provides cloud-based object storage for all data types. S3 offers unlimited scalability and 99.999999999% durability, which you can leverage for many purposes. For example, you can use S3 for data backup and archives, storing user-generated data, and also as a data lake collecting big data.
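To get a feel for what a durability figure of 99.999999999% means in practice, the short calculation below converts it into an expected loss rate. It assumes the figure is an annual, per-object probability and that losses are independent; the object count is a hypothetical example.

```python
from fractions import Fraction

durability = Fraction("99.999999999") / 100     # eleven nines, as a fraction
annual_loss_probability = 1 - durability        # per object, per year
stored_objects = 10_000_000                     # hypothetical bucket size

expected_losses_per_year = stored_objects * annual_loss_probability
years_per_lost_object = 1 / expected_losses_per_year

print(float(annual_loss_probability))   # chance of losing a given object in a year
print(float(expected_losses_per_year))  # expected objects lost per year
print(int(years_per_lost_object))       # average years between single-object losses
```

Under these assumptions, storing ten million objects corresponds to losing roughly one object every 10,000 years on average.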
Read more in the detailed guide about S3 Storage
Amazon S3 Glacier
Amazon S3 Glacier provides cold storage for Amazon S3, which is offered at relatively low costs. However, additional fees are charged for data retrieval. S3 Glacier is typically used for archives and long-term backup.
Additional Storage-Related Services
Here are several useful services to consider when using AWS cloud storage:
- AWS Backup—a fully managed AWS service that helps you centralize all backup activities, including AWS resources and on-premises backup.
- AWS Storage Gateway—helps you integrate on-premises storage with AWS-based cloud storage.
- Amazon FSx for Lustre—a cloud-based file system designed and optimized for high-performance computing (HPC) and compute-intensive workloads.
- Amazon FSx for Windows File Server—a Microsoft Windows file system, designed to help you easily move Windows-based applications to AWS cloud environments.
Azure Storage Services
Microsoft Azure offers a wide range of cloud computing services, designed to provide scalability for enterprises and enable hybrid infrastructure implementations. Notable and popular Azure storage services include Blob Storage, Azure Files, Azure Queue Storage, Table Storage, and Azure Managed Disk.
Learn more in our detailed guide to Azure Storage services
Azure Blob Storage
Blob Storage lets you store massive amounts of unstructured data as objects in the Azure cloud. You can organize your files in containers, which can be accessed by end-users and applications via HTTP connections.
When you use Blob Storage, there is no need for publishing and access-policy tuning. Additionally, there is no limit to the number of blobs or containers you can use, provided you stay within the storage limits of your account.
Azure Files
Azure Files offers fully managed file storage based in the Azure cloud. It is accessible through the Server Message Block (SMB) protocol, storage client libraries, and REST APIs. Azure Files is typically used to achieve high availability for network file shares. You can, for example, use this service to provide multiple VMs with read and write access to the same files.
Azure Queue Storage
Azure Queue Storage lets you store messages and exchange data between components in the cloud or on-premises. You can store messages communicated asynchronously and share them between independent application components, using HTTP or HTTPS.
You can use Queue Storage for a variety of cases. For example, you can use this service for messages communicated between Azure Web and Worker roles, as well as for processing backlog messages.
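The asynchronous pattern described above can be sketched with Python's standard library. This is an in-process stand-in that uses `queue.Queue` and a thread rather than the Azure service or its SDK, and the task names are made up for illustration.

```python
import queue
import threading

messages = queue.Queue()   # stands in for the hosted queue
processed = []


def worker():
    # Pull messages until the sentinel arrives, independent of the producer.
    while True:
        message = messages.get()
        if message is None:
            break
        processed.append(message.upper())


consumer = threading.Thread(target=worker)
consumer.start()

# The "web" side enqueues work and moves on without waiting for results.
for task in ["resize image 1", "resize image 2", "send receipt"]:
    messages.put(task)
messages.put(None)          # sentinel: no more work

consumer.join()
print(processed)
```

The key property is that the producer and the consumer only share the queue, so either side can be scaled, restarted, or slowed down without the other needing to know.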
Azure Table Storage
Azure Table is a NoSQL datastore that lets you store large amounts of non-relational structured data. Table allows you to insert and retrieve data through authenticated API calls or client libraries. The source can be located outside or inside the Azure cloud.
You can use Azure Table to store flexible datasets at scale, without establishing relationships between the sets. The service lets you organize data in tables. Each table can hold collections of row entities and their properties. Common use cases include storing customer information and diagnostic logs.
Learn more in our detailed guide to Azure Table Storage
Azure Managed Disk
Azure Managed Disk provides a fully managed block-level volume service. A managed disk works as a virtualized physical disk, which operates in the Azure cloud and is used with Azure virtual machines (VMs). A managed disk is stored as a page blob, which acts as a random I/O storage object. You can define the size and type of the disk and then provision the disk.
Google Cloud Storage Services
Google Cloud offers many services, designed for a wide range of industries. Popular and commonly used Google Cloud storage services include Cloud Storage, Google Cloud Persistent Disks, and Google Cloud Filestore.
Learn more in our detailed guide to Google Cloud Storage
Google Cloud Storage
Google Cloud Storage is an object storage service offering high durability, low latency, and geo-redundancy when storing data in multi-regions or dual-regions. It offers many useful features, such as fine-grained permissions, object versioning, and retention policies.
You can use Google Cloud Storage for many scenarios, including backups, archives, and ML-based analysis. It also serves as the underlying storage of many other Google Cloud services.
Google Cloud Persistent Disks
Persistent Disk offers highly durable block storage for VM instances. A persistent disk works like a storage device, which instances can access just like a physical disk. You attach persistent disks to instances running in Compute Engine - Google’s VM service - and Google Kubernetes Engine (GKE) - Google’s fully-managed Kubernetes service.
Learn more in our detailed guide to GCP Persistent Disks
Google Cloud Filestore
Filestore is Google’s fully-managed file storage service. You can use Filestore to create shared file systems for your applications. Filestore provides network attached storage (NAS) for your Compute Engine and GKE instances. It works in the cloud, on-premises, and in hybrid scenarios.
Learn more in our detailed guide to Google Cloud Filestore
Cloud Storage vs Cloud Backup
Although both use cloud resources to store data, cloud storage and cloud backup are not interchangeable terms. To clarify these terms, remember the following:
- Cloud storage—often used to supplement or replace local storage. Cloud storage can be used to store active, infrequently accessed, or archived data. You can also use it to store backups of cloud or on-premises resources. It enables you to provide distributed, remote access to data with centralized management.
- Cloud backup—can refer to the process of duplicating data to the cloud for recovery, the actual backups themselves, or the service that is used to store backup data. Cloud backups are used to provide redundancy for data and ensure that a copy remains accessible even if the original is damaged.
Below we describe backup services offered by the three leading public cloud providers. You can use these services to create and manage backups of your data.
AWS Backup
AWS Backup is a service that you can use to create backups of EBS, EFS, DynamoDB, and RDS services. It also includes an integration with AWS Storage Gateway that enables you to create backups of on-premises data. This service enables you to back up most AWS data when combined with the native snapshot capabilities that are included in many of AWS’s other services.
Through the AWS Backup console, you can manage backups from across your services. This includes determining which storage services you store backups in, who has access to backup data, and how long you retain backups for.
Another way to back up data on AWS is using AWS snapshots, also known as EBS snapshots. Elastic Block Store (EBS) volumes serve as virtual hard disks that can be attached to EC2 instances. A snapshot creates a point-in-time backup of an EBS volume, storing it in Amazon S3.
Azure Backup
Azure Backup is a solution you can use to back up Azure services or on-premises data. It enables you to automate and manage backups and their life cycles. This service also integrates with Recovery Services vaults, storage resources designed specifically for backup data.
Azure Backup and Recovery Services are part of a collection of services Azure offers for backup creation and management. The other main service is Azure Site Recovery. This service enables you to create and remotely store backups of your data and services which you can then use for disaster recovery or as failover services.
Google Cloud Backup
Unlike AWS or Azure, Google Cloud does not offer a specific service for backup creation or management. Instead, it enables you to store backups in lower tier (i.e., cheaper) storage services. Your primary storage options include:
- Nearline Storage—designed for data that is accessed once a month or less. This option is best suited to your most recent backup or partial backups.
- Coldline Storage—designed for data that is accessed once a year or less. This option is best suited to disaster recovery backups or archived backup data.
These tiers are reasonable backup targets in Google Cloud, but the equivalent tiers from other cloud providers are less practical for this purpose. The difference is that Google’s services provide access with latency measured in milliseconds, while cold and archive storage in AWS or Azure can take several hours or days to retrieve.
Cloud Storage vs Cloud Database
Cloud storage enables you to store unstructured data or files, while cloud databases enable you to store structured data. You can store this data in tables, for relational databases, or in other formats like key-value pairs, for NoSQL databases.
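The difference between the two database styles mentioned above can be shown with a small local sketch. It uses Python's built-in `sqlite3` module for the relational side and a plain dictionary as a stand-in for a key-value store; the table, keys, and records are hypothetical, and neither is a cloud service.

```python
import sqlite3

# Relational: rows in a table with a fixed schema, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York")],
)
london = conn.execute(
    "SELECT name FROM customers WHERE city = ?", ("London",)
).fetchall()

# Key-value: each record is looked up by its key; values need no shared schema.
kv_store = {
    "customer:1": {"name": "Ada", "city": "London"},
    "customer:2": {"name": "Grace", "tags": ["vip"]},
}
grace = kv_store["customer:2"]

print(london, grace)
conn.close()
```

The relational query filters on any column, while the key-value lookup is limited to the key but lets each record carry different fields.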
Cloud databases rely on cloud storage. In some cases the storage is abstracted from the user and packaged in the database solution. In other cases, the user has to deploy their own storage and connect the database to that storage.
All three major cloud providers offer database services that are based on their own resources. Some also enable you to host data or workloads in hybrid or on-premises resources. Below you can see a collection of the various database services each provider offers.
AWS Cloud Database Services
AWS database services include:
- Amazon RDS—a relational database service that supports your choice of six engines, including SQL Server, MariaDB, Oracle, PostgreSQL, MySQL, and Amazon Aurora. This is a fully managed service that you can manage through CLI, API, or the Console.
- Amazon Aurora—a proprietary relational database that offers PostgreSQL and MySQL compatibility modes. This database is designed for high performance and integration with AWS services.
- Amazon DynamoDB—a document and key-value database designed for low latency, affordable pricing, high performance, and durability. This is a fully managed service for production level workloads.
- Amazon ElastiCache—an in-memory data store that you can use in place of a traditional database. It is compatible with Redis and Memcached and provides scalability, high performance, and super low latency.
- Amazon Neptune—a fully managed graph database that is optimized for high speed querying. It supports RDF and Property Graph data models and SPARQL and Gremlin languages.
- Amazon Timestream—a fully managed time series database designed for analytics, DevOps, and IoT workloads. It enables you to stream data and perform time sensitive queries.
- Amazon Quantum Ledger Database—a fully managed ledger database that you can use to verify transactions cryptographically. Stored data is transparent and immutable, making it ideal for auditing and financial transactions.
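To illustrate the general idea of verifying records cryptographically, as mentioned for ledger databases above, here is a minimal hash-chain sketch. It is a simplified illustration of the technique, not how Amazon QLDB is implemented, and the function names and transaction fields are hypothetical.

```python
import hashlib
import json


def entry_hash(previous_hash, payload):
    # Hash the payload together with the previous entry's hash.
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(previous_hash.encode() + body).hexdigest()


def append(ledger, payload):
    previous_hash = ledger[-1]["hash"] if ledger else "0" * 64
    ledger.append({"payload": payload, "hash": entry_hash(previous_hash, payload)})


def verify(ledger):
    """Recompute every hash in order; an edited entry no longer matches."""
    previous_hash = "0" * 64
    for entry in ledger:
        if entry["hash"] != entry_hash(previous_hash, entry["payload"]):
            return False
        previous_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"from": "alice", "to": "bob", "amount": 25})
append(ledger, {"from": "bob", "to": "carol", "amount": 10})
print(verify(ledger))                  # True

ledger[0]["payload"]["amount"] = 2500  # tamper with history
print(verify(ledger))                  # False
```

Because each hash depends on the one before it, changing an old entry without detection would require recomputing every later hash as well.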
Azure Cloud Database Services
Azure database services include:
- Azure Cosmos DB—a fully managed, multi-model database. It enables you to define schema and indexes from your workloads and application for maximum flexibility. Cosmos DB includes API support for Table, SQL, MongoDB, Gremlin, Cassandra, Spark, and etcd.
- Azure SQL Database—a fully managed database based on the SQL Server engine. It is highly available, scalable, and includes serverless options. You also have the option of bringing existing SQL Server licenses from on-premises.
- Azure Database for MySQL—a fully managed database service based on the MySQL Community edition. You can integrate it with Azure Kubernetes Service and Azure App Service.
- Azure Database for PostgreSQL—a fully managed database for PostgreSQL Hyperscale. You can use this service in the cloud or on-premises. It is extensible through plugins, including for Azure Data Studio, Timescale DB, Visual Studio Code, and PostGIS.
- SQL Server on Virtual Machines—a service that enables you to host SQL Server on VMs with hybrid connectivity. You can use this database with Windows or Linux and use it to extend support for SQL Server 2008.
- Azure Synapse Analytics—an analytics service that combines big data analytics with data warehousing. It is integrated with SQL and Apache Spark engines and can be integrated with Cosmos DB.
- Azure Data Explorer—a data analytics service that you can use to perform real-time analytics on streaming data. You can use it to perform time series analyses and query big data.
- Azure Cache for Redis—a fully managed, in-memory data store that supports Redis workloads. It includes features for built-in security, scalability, and reliability.
- Azure Database for MariaDB—a fully managed database based on the community version of MariaDB. It supports a variety of open source frameworks and includes features for high availability, scalability, and built-in security.
Google Cloud Database Services
Google Cloud database services include:
- Cloud SQL—a fully managed database that provides support for SQL Server, PostgreSQL, and MySQL workloads. It includes features for automated backups, data replication, and failover.
- Cloud Spanner—a fully managed relational database that is ACID compliant, globally distributed and supports automatic sharding. It also offers multi-regional availability and transparent synchronous replication.
- BigQuery—a scalable, serverless, multi-cloud data warehouse. You can use it to perform predictive and real-time analytics with built-in machine learning.
- Cloud Bigtable—a fully managed NoSQL database designed for operational and analytical workloads. It is cluster based and can scale to hundreds of nodes with built-in replication and high availability.
- Cloud Firestore—a fully managed document database that you can use to store, sync, and query application data. It includes client libraries for offline support and live synchronization and integrates with Firebase.
- Firebase Realtime Database—a NoSQL database that provides real-time data synchronization. You can use it to collaborate globally, support serverless applications, and provide offline support.
- Cloud Memorystore—a fully managed in-memory data store that supports Memcached and Redis. You can use it to migrate caching layers and create application caches. It includes features for automatic failover, monitoring, patching, and high availability.
Cloud Storage with NetApp CVO
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
See Additional Guides on Key Cloud Storage Topics
NetApp, together with several partner websites, has authored a large repository of content that can help you learn about many aspects of cloud storage. Check out the articles below for objective, concise reviews of key cloud storage topics.
Cloud File Sharing
Authored by NetApp
File shares support some of the most important workloads that enterprise businesses rely on, and the resources of the public cloud have created interesting new possibilities. Every major public cloud provider now offers its own cloud file sharing service, each with its own target workloads and considerations. But not every enterprise will find what they’re looking for in a fully managed, all-cloud service.
Authored by NetApp
Multicloud strategies are becoming more popular as organizations seek to optimize their cloud services and deployments. These strategies can help you prevent vendor lock-in, increase your flexibility, and help you optimize costs.
This guide explains what multicloud storage is, how it works, what it’s used for, the core requirements for this storage, and how Cloud Volumes ONTAP supports it.
AWS Database Services
Authored by NetApp
AWS offers a range of database services and support to try to meet all its clients' needs. Many of these services are fully managed to help reduce your IT workload and enable you to store and use data as simply as possible.
This guide explains what AWS database support is available, what database services are available, and how you can migrate your databases to AWS.
AWS Snapshots for Amazon EBS
Authored by NetApp
Snapshots are a common method for natively backing up cloud data and services. This method enables you to save point in time backups which can be restored when needed.
This guide explains what types of storage snapshots are available, what AWS snapshots are, and how to use AWS snapshots.
Azure File Storage
Authored by NetApp
Storing file data in Azure is simple through Azure File Storage service. This service enables you to store files across cloud and on-premises resources, enabling you to flexibly and securely share data and workflows.
This guide explains what Azure File Storage is, common use cases for Files, management concepts and components of the service, how data is accessed and the architecture of the service, and some best practices for securing your data.
Authored by NetApp
Azure Files is one of several storage services available to users in Azure. It is a service designed to replicate file shares like those commonly used on premises. With this service, you can smoothly transition your files to the cloud and allow file sharing across your teams.
This guide explains what Azure Files is, how it complements other storage services, pricing and use cases for Files, and pros and cons you should be aware of.
Azure Database Services
Authored by NetApp
Nearly every production cloud deployment has one or more databases. These tools provide support for applications, enable workloads, and organize your data meaningfully. Having databases available that support all your needs is essential and Azure offers a range to choose from.
This guide explains what Azure database workloads are supported, how databases work in Azure, and what services are available.
Google Cloud Storage
Authored by NetApp
Google Cloud offers a variety of storage options for you to choose from. These services form the base of many other services in the cloud and understanding what your options are can help you manage your cloud more efficiently.
This guide explains what Google Cloud Storage options exist and their common uses.
Google Cloud Database Services
Authored by NetApp
Google Cloud’s specialty is flexibility and integration of services and this extends to its database services. In Google Cloud you have a wide variety of database deployments, models, and support to choose from.
This guide explains your options for deploying databases in the cloud, what Google Cloud database services are available, and how to choose the right service for you.
Authored by NetApp
Software developers and DevOps engineers are packaging applications into lightweight units called containers. Kubernetes helps manage and scale containers across clusters of physical machines.
In this environment, Kubernetes storage becomes a significant challenge. By default, containers are ephemeral, meaning that any transient data on the container is lost when it shuts down. However, Kubernetes provides several options for persistent storage.
File Upload and Sharing Technologies
Authored by Cloudinary
File uploads are a common method of collecting file data from users and creating interactivity in services. For example, file uploads are used to enable users to edit their own images or submit documents for translation.
This guide explains what file uploads are, covers the most common types of file upload methods, and explains how you can use Cloudinary to upload files through a variety of languages and frameworks.
Authored by NetApp
Learn the basics of storing data in Amazon Simple Storage Service (S3), Amazon’s first cloud service and still one of its most popular.
Authored by NetApp
Learn about storage solutions in the Microsoft Azure cloud, including object storage, block storage, and file storage solutions.
Additional Cloud Storage Resources
See these additional articles from our content partners to learn more about cloud storage topics.