BlueXP Blog

Azure Storage Options for Backup and Archive Data

Written by Gali Kovacs | Oct 7, 2020 4:27:00 AM

Not all data needs to be stored in the highest performing storage format. In general, that applies to any data that is not regularly accessed, such as raw data, network telemetry dumps, Azure backup data, and archival data that has to be retained indefinitely. There are cost-effective Azure storage options specifically designed to retain this kind of infrequently accessed data.

While the most critical enterprise workloads run on Azure premium storage, investing in costly on-premises storage systems or high-performance cloud storage for infrequently accessed data is not economical in the long run. That’s where Azure cold storage and archive storage options come in: they can help lower your cloud bill, specifically when it comes to your Azure backup and archive costs.

In this post we will review the most cost-effective Azure storage options for backup and archive data, which will help reduce your overall TCO. We’ll profile the different features of Azure Blob storage’s cool and archive tiers and show you why you should incorporate them into your IT infra-architecture as a solution for low-cost data storage. We’ll also look at how Cloud Volumes ONTAP can help you further save on Azure.

Azure Cool Storage Tier

Before we discuss the Azure cold storage option, a bit of background. Azure storage accounts can be created under three categories: General-purpose v1 (GPv1), General-purpose v2 (GPv2), and Blob storage.

GPv1 supports all Azure storage options such as blobs, files, queues, and tables. GPv1 is now considered a legacy account type, and Azure recommends using GPv2 where possible, except for:

  • Applications that require the Azure classic deployment model.
  • Applications that don't require large capacity but are transaction-intensive or use significant geo-replication bandwidth, as GPv2 may increase storage costs. 
  • Applications that use a Storage Services REST API version earlier than 2014-02-14 or a client library version lower than 4.x.

Tiering data between hot and cool tiers is supported in Blob storage accounts. Because GPv2 accounts support the functionality of both Blob storage and GPv1 accounts, they also support tiered storage.

Azure’s cool access tier, also known as Azure cool Blob storage, is for infrequently-accessed data that needs to be stored for a minimum of 30 days. Typical use cases include backing up data before tiering to archival systems, legal data, media files, system audit information, datasets used for big data analysis and more.

The storage cost for this Azure cold storage tier is lower than that of the hot storage tier. Since the data stored in this tier is expected to be accessed less frequently, the data access charges are higher than in the hot tier. No changes are required in your applications, as these tiers are accessed through the same APIs you already use for Azure storage.

When it comes to Azure storage performance, a single blob on both the cool tier and hot tier has a target throughput of 60 MiB per second or up to 500 requests per second. The cool tier availability is 99%, which is slightly less than the 99.9% available for the hot tier. For read-access geo-redundant storage (RA-GRS), the cool tier availability is 99.9%, while the hot tier is 99.99%. These differences reflect the nature of the data stored in the Azure cold storage tier, which is accessed less often and is therefore less affected by lower availability targets.
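To put those availability figures in perspective, the arithmetic below converts each percentage into the maximum downtime it allows over a year. It is only a back-of-the-envelope sketch: the 365-day year and the function name are assumptions made for illustration, while the percentages are the ones quoted above.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def max_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be unavailable at the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# Availability figures quoted in the article.
for label, percent in [
    ("cool", 99.0),
    ("hot", 99.9),
    ("cool, RA-GRS", 99.9),
    ("hot, RA-GRS", 99.99),
]:
    print(f"{label}: {max_downtime_hours(percent):.2f} hours/year")
```

In other words, 99% availability allows roughly 87.6 hours of downtime a year, while 99.9% allows about 8.76 hours.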

The Azure cool tier is equivalent to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage in AWS, which provides low-cost, high-performance storage for infrequently accessed data.

Azure Archive Storage Tier

The Azure archive access tier, as the name indicates, is low-cost storage for archival data that will be rarely accessed. This tier is meant for data that will remain in archival storage for a minimum of 180 days. Long-term backup data, medical history, business transaction logs for audit and compliance, and call center recordings are just some of the use cases that could benefit from the lengthy retention period on Azure archive storage.

Even though Azure archive storage offers the lowest cost in terms of data storage, its data retrieval charges are higher than those of the hot and cool tiers. In fact, archive tier data remains offline until the tier of the data is changed using a process called rehydration. Rehydrating data from the archive tier and moving it to either the hot or cool tier can take up to 15 hours with standard priority, or up to 1 hour with high priority for objects under 10 GB. The archive tier is therefore only intended for data that can afford that kind of access delay.

Azure also does not provide any availability SLAs for the archive tier. The resiliency of the stored data is assured by replication at the storage level. The tier supports locally redundant storage (LRS), geo-redundant storage (GRS), and RA-GRS.

The counterpart of the Azure Storage archive tier in AWS is Amazon S3 Glacier, a long-term, secure, and durable object storage service.

Cool Access Tier vs. Archive Access Tier

The cool access tier is for the kind of data you want to store on a short-term basis but may not access on a day-to-day basis, such as banking or e-commerce transaction history. You need the data to be online, should there be a requirement to retrieve it. The speed at which the data can be accessed when needed is almost the same as that of the hot storage tier.

One common use case for the cool storage tier is short-term data backup, which could at a later point be moved to the hot tier for restore or eventually moved to the archive tier for long-term storage. It can also be used for storing analytics data, large media files, telemetry, etc. Data of this kind usually needs a high-capacity but low-cost storage option, and the cool tier fits the bill in such cases.

The archive tier targets data that can be stored offline on a long-term basis, with no immediate or urgent need for access. If the data does need to be accessed, a wait time of up to 15 hours should be acceptable. A possible example is the retrieval of medical records from an archive for research purposes. Unlike the cool tier, the archive tier cannot be enabled at the storage account level. Instead, it relies on blob-level tiering, where the archive tier is set on individual objects.
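As an illustration of blob-level tiering, here is a minimal sketch that assumes the `azure-storage-blob` (v12) Python SDK, whose `BlobClient` exposes `set_standard_blob_tier`. The helper names `move_blob_to_tier` and `make_blob_client` are hypothetical, and the connection string, container, and blob name are placeholders you would supply yourself. Treat it as a starting point rather than a definitive implementation.

```python
ALLOWED_TIERS = {"Hot", "Cool", "Archive"}


def move_blob_to_tier(blob_client, tier, rehydrate_priority=None):
    """Set the access tier of a single blob (hypothetical helper).

    blob_client: any object exposing set_standard_blob_tier(), such as
        azure.storage.blob.BlobClient.
    rehydrate_priority: optional "Standard" or "High"; only relevant when
        moving a blob out of the archive tier.
    """
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"Unknown tier: {tier!r}")
    kwargs = {}
    if rehydrate_priority is not None:
        kwargs["rehydrate_priority"] = rehydrate_priority
    blob_client.set_standard_blob_tier(tier, **kwargs)


def make_blob_client(connection_string, container, blob_name):
    """Build a BlobClient for one blob; all arguments are placeholders."""
    # Imported lazily so move_blob_to_tier can be used (and tested)
    # without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_blob_client(container=container, blob=blob_name)
```

With a real client, `move_blob_to_tier(client, "Archive")` would take the blob offline, and `move_blob_to_tier(client, "Cool", rehydrate_priority="High")` would request a rehydration back to the cool tier.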

General availability of the archive tier was announced in December 2017, and the service is not yet available in all Azure regions. One common use case for the archive tier is long-term data retention for compliance. Low-cost storage for data sets such as hospital patient records, offsite Azure blob backups, and raw media files are other possible scenarios where the archive tier can be used.

In situations where the data access patterns are unclear, it is recommended to start with the hot storage tier, monitor access for a while, and then move the data to the cool or archive storage tier based on access frequency.
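Once some monitoring data is available, the choice can be reduced to a simple rule. The sketch below is one possible way to encode it, not an official Azure recommendation: the 30-day and 180-day minimums come from this article, while the function name, its parameters, and the reads-per-month cutoff are assumptions for illustration.

```python
COOL_MIN_DAYS = 30       # minimum retention for the cool tier (from the article)
ARCHIVE_MIN_DAYS = 180   # minimum retention for the archive tier (from the article)
FREQUENT_READS_PER_MONTH = 30  # hypothetical cutoff; tune it from your own monitoring


def suggest_tier(expected_retention_days, reads_per_month, can_wait_hours):
    """Suggest 'hot', 'cool', or 'archive' for a data set."""
    # Frequently read data stays hot regardless of how long it is kept.
    if reads_per_month >= FREQUENT_READS_PER_MONTH:
        return "hot"
    # Archive only fits long-lived data whose readers can wait for rehydration.
    if expected_retention_days >= ARCHIVE_MIN_DAYS and can_wait_hours:
        return "archive"
    # Cool fits data that stays at least 30 days but must remain online.
    if expected_retention_days >= COOL_MIN_DAYS:
        return "cool"
    # Short-lived data would be deleted before the cool tier's minimum period.
    return "hot"


print(suggest_tier(365, 0, can_wait_hours=True))
print(suggest_tier(365, 2, can_wait_hours=False))
print(suggest_tier(10, 1, can_wait_hours=True))
```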

The Cost-Benefit Analysis

The archive tier offers the lowest storage costs available on the Azure cloud: depending on the region, rates can be as low as $0.00099 to $0.002 per GB for the first 50 TB. However, reading data from the archive tier can be costly, at $5 for every 10,000 read operations. An early deletion charge, effective since March 1, 2018, applies to any data deleted before the archive tier’s minimum period of 180 days. If your data was stored for less than 180 days, you are charged for the remaining days. For example, if the data was stored for 130 days, the charge would be for 50 days of storage.
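The early deletion arithmetic can be sketched in a few lines. The 180-day minimum and the 130-day example come from the text above. The function names, the assumption that the per-GB rate is billed monthly, and the 30-day proration are illustrative assumptions, not Azure's published billing formula.

```python
ARCHIVE_MIN_DAYS = 180  # minimum retention period for the archive tier


def early_deletion_days(days_stored: int) -> int:
    """Days of storage still charged when archived data is deleted early."""
    return max(0, ARCHIVE_MIN_DAYS - days_stored)


def early_deletion_charge(days_stored, size_gb, price_per_gb_month,
                          days_per_month=30):
    """Estimated charge in USD, assuming a monthly rate prorated by day."""
    remaining = early_deletion_days(days_stored)
    return size_gb * price_per_gb_month * remaining / days_per_month


# The article's example: data deleted after 130 days is charged for 50 more.
print(early_deletion_days(130))
# Illustrative only: 1,000 GB at the upper archive rate of $0.002 per GB.
print(round(early_deletion_charge(130, 1000, 0.002), 2))
```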

Cool tier data storage costs $0.01 per GB for up to 50 TB, which is higher than that of the archive tier. However, there are no high charges associated with data read operations: 10,000 read operations cost only $0.01 in the cool tier.

As can be seen from the costs above, selecting the optimal low-cost storage depends mainly on the size of the data and the usage pattern. Terabytes of data can be stored at very low cost in the archive tier. However, if the data starts to be read more frequently, it is usually more economical to rehydrate it and store it in the cool or hot tier.
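To make the trade-off concrete, the sketch below compares the two tiers using only the rates quoted in this section and finds the read volume at which the cool tier becomes cheaper. It assumes the per-GB rates are monthly, and it ignores per-GB retrieval fees, write operations, and early deletion charges, so treat the result as a rough guide rather than a bill estimate. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRates:
    storage_per_gb: float  # USD per GB (assumed to be per month)
    reads_per_10k: float   # USD per 10,000 read operations


# Rates quoted in the article; the archive figure is the upper end of its range.
ARCHIVE = TierRates(storage_per_gb=0.002, reads_per_10k=5.00)
COOL = TierRates(storage_per_gb=0.01, reads_per_10k=0.01)


def monthly_cost(rates: TierRates, size_gb: float, read_ops: float) -> float:
    """Storage plus read-operation cost for one month."""
    return size_gb * rates.storage_per_gb + read_ops / 10_000 * rates.reads_per_10k


def break_even_reads(size_gb: float) -> float:
    """Monthly read operations above which cool is cheaper than archive."""
    storage_saving = size_gb * (COOL.storage_per_gb - ARCHIVE.storage_per_gb)
    extra_cost_per_read = (ARCHIVE.reads_per_10k - COOL.reads_per_10k) / 10_000
    return storage_saving / extra_cost_per_read


size_gb = 1000
for reads in (0, 100_000):
    print(reads,
          round(monthly_cost(ARCHIVE, size_gb, reads), 2),
          round(monthly_cost(COOL, size_gb, reads), 2))
print(round(break_even_reads(size_gb)))
```

Under these assumptions, 1,000 GB costs less in the archive tier until it is read roughly 16,000 times a month, after which the cool tier comes out ahead.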

Integration with Existing Storage Systems

Both the cool and archive tiers can integrate with several industry-leading storage vendors to provide hybrid cloud storage offerings. Frequently accessed data is stored in on-premises storage or in the hot tier, and less frequently accessed data is tiered to the cool or archive tier in Azure. Data could initially be placed in the cool tier and moved to the archive tier at a later point, based on access patterns.

The native offering from Microsoft for this is the StorSimple solution, which supports the cool tier. Different flavors of the solution, such as the StorSimple physical array, Virtual Array, and cloud array, have this integration available out of the box. But with the Virtual Array nearing its December 2020 end of life, and the StorSimple 8000 series reaching end of life on December 31, 2022, StorSimple, though still advertised, is no longer an option.

How to Get Even Lower Storage Costs with Cloud Volumes ONTAP

NetApp offers out-of-the-box cloud storage integration with Cloud Volumes ONTAP for Azure storage. Cloud Volumes ONTAP interoperates with Azure cloud storage to cater to multiple use cases such as enterprise workload hosting, disaster recovery with Azure Site Recovery, DevOps and as a file share service.

Depending on the workloads to be supported, customers can choose from the different storage types available in Azure, such as HDD-backed or SSD-backed storage. In either case, Cloud Volumes ONTAP helps with Azure cost optimization, lowering Azure storage costs by up to 70% with space-saving storage efficiencies and data tiering to Azure Blob storage.

NetApp’s hybrid cloud solution helps enterprise customers reap the benefits of both on-premises storage and the cloud storage features of the cool and archive Azure tiers. To see how using Cloud Volumes ONTAP can cut down your TCO on Azure, check out our free Azure calculator.

Summary

Many organizations are addressing the exponential data growth and backup data storage challenge by leveraging cloud storage or hybrid cloud storage solutions. Cloud storage is a cheaper option when compared to on-premises data storage costs. Further cost reductions can be achieved by leveraging tiered low-cost Azure storage options for least-accessed data.

Azure’s cool and archive tiers are inexpensive Azure cloud storage variants that fit into this space, and NetApp’s Cloud Volumes ONTAP can reduce costs further with its range of storage-efficiency features. It is highly recommended that organizations include them in their storage architecture.