BlueXP Blog

What is Azure Blob?

Written by Liat Sokolov, Product Manager | Jul 16, 2019 6:04:54 AM

Cloud-based object storage offers a less expensive option for storing large quantities of unstructured data in the cloud. Compared to traditional storage, where data is stored as blocks, object storage stores data as objects and tags them with identifying metadata. Azure Blob is Azure's standard cloud object storage service, and thanks to NetApp Cloud Tiering, it can now serve as an alternative or complement to on-premises AFF storage systems.

With Cloud Tiering in AFF, NetApp brings the best of both worlds together by enabling customers to use low-cost Azure Blob storage as a capacity tier for on-premises AFF storage. This blog introduces the features of Azure Blob storage and shows how it fits into the hybrid cloud storage solution that NetApp AFF offers via Cloud Tiering.

What is Azure Blob?

Unstructured data is, broadly, data that does not conform to a specific data model, such as text or binary data. Azure Blob storage is designed to store large amounts of unstructured data that can be accessed over HTTP/HTTPS by applications hosted on-premises as well as in the cloud. Data stored in Azure Blob storage can be accessed by making RESTful API calls or by connecting to the storage using Azure PowerShell or the Azure CLI. You can also integrate the Azure Storage client libraries with applications written in most popular development platforms, including .NET, Java, Python, Ruby, Node.js, and others.
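For example, here is a minimal sketch of uploading and reading back a blob with the Azure Storage client library for Python (azure-storage-blob); the connection string, container, and blob names below are placeholders for illustration only.

```python
from azure.storage.blob import BlobServiceClient

# Placeholder connection string; in practice, read it from a secure source.
CONNECTION_STRING = "<your-storage-account-connection-string>"

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)

# Create (or reuse) a container and upload a small blob into it.
container = service.get_container_client("logs")
if not container.exists():
    container.create_container()

blob = container.get_blob_client("app/config.json")
blob.upload_blob(b'{"log_level": "info"}', overwrite=True)

# Read the blob back; the SDK talks to the same REST endpoints over HTTPS.
print(blob.download_blob().readall())
```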

Common use cases for Azure Blob storage are:

  • Storing large video and audio files to be used by streaming applications.
  • Uploading files for distributed access architecture or to be served directly by the browser.
  • Centralized storage of application log and configuration files.
  • Big data and analytics storage.
  • Offsite storage of backup and archival files at lower cost.

A blob resides in a logical construct called a container inside an Azure storage account. Redundancy of the data in Azure Blob is ensured through replication within the same Azure data center, across paired regions, or across multiple availability zones. The ingress and egress available to the storage account depend on the selected replication type: when data is replicated across multiple zones (zone-redundant storage, ZRS) or within the same data center (locally redundant storage, LRS), Azure Blob can provide a maximum ingress of 20 Gbps. Azure Blob supports a maximum capacity of 2 PB per storage account in Azure's US and Europe regions, while other regions support a maximum capacity of 500 TB.
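The replication type is chosen when the storage account is created. As a rough sketch (assuming the azure-identity and azure-mgmt-storage Python packages; the subscription, resource group, account name, and region are placeholders), a zone-redundant account could be provisioned like this:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

# Placeholders for illustration only.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "tiering-demo-rg"
ACCOUNT_NAME = "tieringdemoblob"

client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create a general-purpose v2 account with zone-redundant storage (ZRS).
poller = client.storage_accounts.begin_create(
    RESOURCE_GROUP,
    ACCOUNT_NAME,
    {
        "location": "westeurope",
        "kind": "StorageV2",
        "sku": {"name": "Standard_ZRS"},
    },
)
account = poller.result()
print(account.primary_endpoints.blob)
```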

Storage Tiers in Azure

One of the biggest triggers for cloud storage adoption is exponential data growth. The lifecycle of data stored in the cloud also depends on the use case it serves. For example, backup and archival data stored in Azure Blob could stay in the cloud for long durations without being accessed, whereas data in blob storage used by a production application will be accessed and modified on a daily basis. To optimize the cost incurred by blob storage, Azure offers Hot and Cool access tiers in Azure Blob, each of which targets different data based on usage.

Hot tier: This tier, as the name indicates, is suited for data that is frequently accessed. Though data tiered from AFF is still termed cold in NetApp's tiering terminology, data in this tier can be considered active. Storage charges in this tier are slightly higher ($0.0184 per GB for the first 50 TB) than in the Cool tier, but the charges to access the data (read/write) are lower. The Hot tier is best suited for frequently accessed application data or data that needs to be stored temporarily before moving to the Cool tier.

Cool tier: Data that is not expected to be accessed for at least 30 days should be moved to Azure Blob's Cool tier. Data storage charges in this tier are very low ($0.01 per GB for the first 50 TB), but charges for read/write operations performed on the data are higher compared to the Hot tier. You will also incur additional charges for data retrieval from the Cool tier, while data retrieval from the Hot tier is free. This tier is best suited for infrequently accessed data that is being moved to the cloud to take advantage of cloud storage economics.
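To illustrate how the access tier is applied in practice, here is a minimal sketch using the Python azure-storage-blob library (the connection string, container, and blob names are placeholders): a blob can be uploaded directly into the Cool tier, or an existing Hot-tier blob can be demoted later.

```python
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<your-storage-account-connection-string>"  # placeholder
service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container = service.get_container_client("backups")

# Upload an archive-style object straight into the Cool access tier.
cool_blob = container.get_blob_client("2019/07/db-backup.bak")
with open("db-backup.bak", "rb") as data:
    cool_blob.upload_blob(data, overwrite=True, standard_blob_tier="Cool")

# Or demote an existing Hot-tier blob once it is no longer accessed.
old_log = container.get_blob_client("logs/app-2019-06.log")
old_log.set_standard_blob_tier("Cool")
```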

Cloud Tiering with NetApp AFF

NetApp AFF is one of the most popular all-flash storage solutions in use by organizations today. It uses the trusted NetApp ONTAP data management technology to provide an enterprise-class storage solution. Cloud Tiering is a new feature for AFF systems that automatically detects infrequently accessed data on the on-premises storage system (which acts as a performance tier) and moves it to Azure Blob storage (which acts as a capacity tier) using NetApp's FabricPool technology. It fits into the "Lift and Don't Shift" strategy, where data is migrated seamlessly to the cloud with no changes required to the applications accessing it.

Tiering of data to the cloud is based on the volume tiering policy configured in ONTAP. There are two types of tiering policy:

Snapshot only: With this policy, only cold Snapshot copy blocks in the volume are tiered to Azure Blob storage. Snapshots are tiered after a cooling period of two days and only once the aggregate reaches 50% capacity. If the snapshots are accessed and the data blocks become hot, the data is moved back to the performance tier on-premises.

Auto: With this tiering policy, all other cold data in the volume, including snapshots, is moved to Azure Blob storage after a cooling period of 31 days. As with the Snapshot-only policy, tiering is initiated only when the aggregate reaches 50% capacity. This intelligent tiering policy distinguishes the sequential reads performed by indexing and antivirus scans from other read types, keeping data accessed by those processes in cloud storage. Any other type of read will heat up the data blocks in object storage and cause them to be moved back to the AFF performance tier on-premises.
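As a purely illustrative sketch of the decision logic described above (this is not ONTAP or FabricPool code; every name here is hypothetical), the two policies could be modeled like this:

```python
from dataclasses import dataclass

AGGREGATE_TIERING_THRESHOLD = 0.50  # tiering starts once the aggregate is 50% full

@dataclass
class Block:
    is_snapshot_only: bool   # block belongs only to a Snapshot copy
    days_since_access: int   # "cooling" time since the block was last accessed

def should_tier(block: Block, policy: str, aggregate_used_fraction: float) -> bool:
    """Return True if a block would be moved to the Azure Blob capacity tier."""
    if aggregate_used_fraction < AGGREGATE_TIERING_THRESHOLD:
        return False  # aggregate not full enough, nothing is tiered yet
    if policy == "snapshot-only":
        # Only Snapshot blocks, after a 2-day cooling period.
        return block.is_snapshot_only and block.days_since_access >= 2
    if policy == "auto":
        # Any cold block (snapshots included), after a 31-day cooling period.
        return block.days_since_access >= 31
    return False

# Example: a user-data block untouched for 40 days on a 70%-full aggregate.
print(should_tier(Block(False, 40), "auto", 0.70))           # True
print(should_tier(Block(False, 40), "snapshot-only", 0.70))  # False
```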

Migrating infrequently accessed data to the cloud helps organizations repurpose high-performance AFF storage for more mission-critical applications, thereby offering an optimal balance of cost and performance. Cloud Tiering is helpful in multiple enterprise use cases, with value propositions for:

  • Initial phases of hybrid cloud architecture implementation.
  • Minimizing new upfront storage investments (CAPEX) by converting them to monthly charges in the cloud (OPEX).
  • Leveraging the cloud without refactoring or rearchitecting applications.
  • Reducing the total cost of ownership of AFF storage in environments with large data growth rates and long-term retention requirements.
  • Management, configuration, and optimization from a single user interface.

Use Cloud Tiering with Azure Blobs

NetApp AFF Cloud Tiering to Azure can be enabled for storage systems running ONTAP 9.4 or later that have network connectivity to Azure Blob storage. You need to deploy a NetApp Service Connector in an Azure VNet that can connect to the on-premises ONTAP cluster in order to identify active and inactive data and configure the tiering policies. The service uses FabricPool, a NetApp Data Fabric technology, in the backend to automate tiering of data based on the configured policy. When configuring where inactive data is stored in Azure Blob, the Hot access tier is selected by default; support for the Cool tier is planned for a future release.

Conclusion

It is estimated that NetApp's Cloud Tiering feature can provide as much as 79% capacity savings on your AFF storage. While enabling optimal usage of storage, Cloud Tiering also provides single-pane management of multiple clusters, thereby eliminating administrative overhead. It also shows tiering health reports as well as active and inactive data reports to give you a holistic view of your storage health across environments.