Azure Data Lake Pricing Explained

Written by Yifat Perry, Technical Content Manager | Nov 18, 2021 9:16:00 AM

How Is Azure Data Lake Priced?

Azure Data Lake Storage Gen2 is an Azure big data solution that lets you run large-scale analytics on top of Azure Blob Storage. Its pricing model is tied closely to Azure Blob Storage pricing. Azure Data Lake pricing models include on-demand, pay-as-you-go rates as well as monthly commitment packages that offer up to 74% off the pay-as-you-go price.

Azure Data Lake Storage Gen2 offers file system semantics, scale, and file-level security, as well as Azure Blob Storage features such as cost-effective tiered storage, high availability, and disaster recovery. Certain aspects are subject to additional fees, including hierarchical namespace metadata, the Cool and Archive storage tiers, and analytics operations measured in Analytics Units (AUs).

Azure Data Lake Storage Pricing Components

Because Data Lake Storage Gen2 is built on top of Azure Blob Storage, it can offer cost-effective prices. The service also provides additional features—such as cool and archive early deletion—to help customers optimize the total cost of ownership for big data analytics workloads running on Azure.

Related content: Learn more in our detailed guide to Azure Data Lake

Azure Data Lake Storage lets you organize data in two distinct ways:

  • Hierarchical namespace: lets you organize your data lake into structured files, folders, and directories.
  • Flat namespace: lets you operate your data lake as an unstructured blob store.

The two namespace options are charged at the same storage rate. However, a hierarchical namespace is subject to additional fees for the metadata associated with the directory and folder structure.

Data Storage Costs

Here is a pricing example for the Premium Storage tier, with hierarchical namespaces, LRS redundancy, within the East US Region:

  • Hot—$0.15 / GB / month flat fee
  • Cool—$0.0208 / GB / month for first 50 TB, down to $0.0152 / GB / month for over 500 TB
  • Archive—$0.0152 / GB / month flat fee

Storage Capacity Reservation Options

You can reserve storage capacity for a term of 1 or 3 years to receive significant discounts on storage costs. Below are the costs for a 3-year commitment, which grants the largest discount:

  • Hot storage reserved for 3 years: monthly charge of $1,406 for 100 TB or $13,523 for 1 PB
  • Cool storage reserved for 3 years: monthly charge of $1,028 for 100 TB or $9,882 for 1 PB
  • Archive storage reserved for 3 years: monthly charge of $84 for 100 TB or $810 for 1 PB
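
To compare reservation sizes, it can help to turn the monthly charges into a total commitment and a per-TB rate. The following is a minimal sketch using the Hot and Cool figures listed above; the function names are illustrative, and treating 1 PB as 1,000 TB is an assumption, not something the price list states.

```python
# Monthly charges (USD) for a 3-year reservation, copied from the list above.
RESERVED_3YR_MONTHLY = {
    "hot": {"100TB": 1406, "1PB": 13523},
    "cool": {"100TB": 1028, "1PB": 9882},
}

TERM_MONTHS = 36   # 3-year term
TB_PER_PB = 1000   # assumption: 1 PB is treated as 1,000 TB


def total_commitment(monthly_charge: float, months: int = TERM_MONTHS) -> float:
    """Total amount paid over the whole reservation term."""
    return monthly_charge * months


def monthly_rate_per_tb(monthly_charge: float, capacity_tb: float) -> float:
    """Effective monthly price per reserved TB."""
    return monthly_charge / capacity_tb


for tier, charges in RESERVED_3YR_MONTHLY.items():
    small = monthly_rate_per_tb(charges["100TB"], 100)
    large = monthly_rate_per_tb(charges["1PB"], TB_PER_PB)
    print(f"{tier}: ${small:.2f}/TB at 100 TB, ${large:.2f}/TB at 1 PB, "
          f"3-year total for 100 TB: ${total_commitment(charges['100TB']):,.0f}")
```

Under these assumptions, the larger reservation works out slightly cheaper per TB than the smaller one for both tiers.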

Transaction and Data Retrieval Costs

There are additional costs according to the number of operations performed on the data:

  • Write operations: $0.228 per 10,000 operations for the Premium tier, down to $0.13 per 10,000 operations for the Archive tier (write operations are cheaper for the colder storage tiers).
  • Data writing per GB: free for all tiers.
  • Read operations: $0.00182 per 10,000 operations for the Premium tier, up to $6.50 per 10,000 operations for the Archive tier (read operations are more expensive for the colder storage tiers).
  • Data retrieval per GB: free for the Premium and Hot tiers, $0.01 / GB for the Cool tier, and $0.02 / GB for the Archive tier.
  • Query acceleration:
    • Built in for the Premium tier and not available for the Archive tier.
    • Data scanned: $0.002 / GB for the Hot tier and $0.002 / GB for the Cool tier.
    • Data returned: $0.0007 / GB for the Hot tier and $0.01 / GB for the Cool tier.
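
Because operations are priced per 10,000 and retrieval per GB, a rough monthly estimate is simple arithmetic. The sketch below uses the rates quoted above; the function names and the sample workload sizes are hypothetical and only illustrate the calculation.

```python
# Rates (USD) quoted in the list above.
WRITE_PER_10K = {"premium": 0.228, "archive": 0.13}
READ_PER_10K = {"premium": 0.00182, "archive": 6.5}
RETRIEVAL_PER_GB = {"premium": 0.0, "hot": 0.0, "cool": 0.01, "archive": 0.02}


def operations_cost(num_operations: int, price_per_10k: float) -> float:
    """Cost of a number of operations billed in blocks of 10,000."""
    return num_operations / 10_000 * price_per_10k


def retrieval_cost(gigabytes: float, price_per_gb: float) -> float:
    """Cost of reading data back out of a tier, billed per GB."""
    return gigabytes * price_per_gb


# Hypothetical workload: 1 million reads against Archive, 500 GB pulled from Cool.
archive_reads = operations_cost(1_000_000, READ_PER_10K["archive"])
cool_retrieval = retrieval_cost(500, RETRIEVAL_PER_GB["cool"])
print(f"Archive reads: ${archive_reads:.2f}, Cool retrieval: ${cool_retrieval:.2f}")
```

The same two helpers cover writes as well; only the rate table changes.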

Special Costs for Cool and Archive Tiers

Any blob you move to either the Cool or Archive tier is subject to an early deletion period, as follows:

  • Archive tier: the early deletion period is 180 days. This applies to all storage.
  • Cool tier: the early deletion period is 30 days. This applies to general-purpose v2 storage accounts.

Early deletion charges are prorated. If, for example, you move a blob to the Archive tier and then delete it or move it to the Hot tier after 45 days, you are charged an early deletion fee equal to 135 days (180 - 45 = 135) of Archive storage.
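
The proration described above can be sketched as a small calculation. This is an illustration only: the per-GB monthly rate is passed in as a parameter (the value used in the example is a placeholder, not a quoted price), and converting days to months by dividing by 30 is an assumption about how the fee is prorated.

```python
# Early deletion periods in days, as stated above.
EARLY_DELETION_DAYS = {"archive": 180, "cool": 30}


def remaining_days(tier: str, days_stored: int) -> int:
    """Days of the early deletion period still outstanding (never negative)."""
    return max(0, EARLY_DELETION_DAYS[tier] - days_stored)


def early_deletion_fee(tier: str, days_stored: int, size_gb: float,
                       price_per_gb_month: float, days_per_month: int = 30) -> float:
    """Storage charge for the outstanding days; 30-day months are an assumption."""
    months_left = remaining_days(tier, days_stored) / days_per_month
    return size_gb * price_per_gb_month * months_left


# The example from the text: a blob leaves the Archive tier after 45 days.
print(remaining_days("archive", 45))  # 135

# Placeholder rate of $0.01 / GB / month for a 1,000 GB blob.
print(round(early_deletion_fee("archive", 45, 1000, 0.01), 2))
```

A blob kept past the full period owes nothing extra, which is why the remaining days are clamped at zero.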

Azure Data Lake Analytics Pricing

Azure Data Lake Analytics provides an on-demand analytics job service designed to help simplify big data. There is no need to deploy, configure, or tune hardware or software. Instead, you can simply write queries to transform data and extract meaningful insights. The Azure Data Lake Analytics service can instantly handle jobs of any scale, while letting you define the amount of power you need. It also lets you pay for a job only when it runs.

Related content: Read our guide to Azure Analytics Services

Azure Data Lake Analytics Units (AUs)

The service bills compute in Azure Data Lake Analytics Units (AUs). These units of computation are available to U-SQL jobs. Each AU provides a job with access to a collection of underlying resources, such as memory and CPU. The price of a job is determined by the number of AUs allocated to it and the length of time it runs.

Be sure to carefully allocate the correct number of AUs for your job requirements. You can increase the number of AUs to increase the number of compute resources available for your job. However, an increase in AUs does not increase the inherent parallelism of the job.

Each job has certain characteristics—how much data it can process, its inherent parallelism, and more. If you want your job to run faster while avoiding over-allocation of AUs, you should use Azure Data Lake Tools for Visual Studio. These tools can help you gain visibility into the performance of your U-SQL jobs and estimate an optimal number of AUs.
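
To see why over-allocating AUs wastes money, consider a deliberately simplified model: a job has a fixed amount of work and a maximum parallelism, AUs beyond that parallelism sit idle, and every allocated AU is billed for the full run. The model, the function names, and the sample job are assumptions for illustration; real U-SQL jobs vary in parallelism from stage to stage. The $2 per AU per hour figure is the pay-as-you-go rate quoted in this article.

```python
PAYG_RATE_PER_AU_HOUR = 2.0  # pay-as-you-go rate quoted in this article (USD)


def job_duration_hours(work_au_hours: float, allocated_aus: int,
                       max_parallelism: int) -> float:
    """Run time when only min(allocated, parallelism) AUs can do useful work."""
    usable_aus = min(allocated_aus, max_parallelism)
    return work_au_hours / usable_aus


def job_cost(work_au_hours: float, allocated_aus: int, max_parallelism: int,
             rate: float = PAYG_RATE_PER_AU_HOUR) -> float:
    """Cost when every allocated AU is billed for the whole run (assumption)."""
    duration = job_duration_hours(work_au_hours, allocated_aus, max_parallelism)
    return allocated_aus * duration * rate


# Hypothetical job: 100 AU-hours of work, inherent parallelism of 20.
for aus in (10, 20, 40):
    hours = job_duration_hours(100, aus, 20)
    print(f"{aus} AUs: {hours:.1f} h, ${job_cost(100, aus, 20):.2f}")
```

In this toy model, going from 10 to 20 AUs halves the run time at the same cost, while going from 20 to 40 AUs doubles the cost with no speedup.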

Pay-As-You-Go Pricing

In the Central US Region, you pay $2 / hour / Analytics Unit. Charges are calculated per second with no long-term commitment.
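
As a quick illustration of per-second billing at this rate, the following sketch converts a job's run time in seconds into a charge. The function name and the sample job are hypothetical.

```python
PAYG_RATE_PER_AU_HOUR = 2.0  # USD per AU per hour, Central US, as quoted above
SECONDS_PER_HOUR = 3600


def payg_cost(allocated_aus: int, runtime_seconds: int) -> float:
    """Pay-as-you-go charge for one job, prorated to the second."""
    au_hours = allocated_aus * runtime_seconds / SECONDS_PER_HOUR
    return au_hours * PAYG_RATE_PER_AU_HOUR


# Hypothetical job: 10 AUs running for 15 minutes (900 seconds).
print(payg_cost(10, 900))  # 5.0
```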

Monthly Commitment Packages

You can pre-purchase AU-hours in a monthly package to receive a discount of up to 74% off the pay-as-you-go rate of $2 per AU-hour. Here are a few commitment options:

  • 100 AU-hours prepaid cost $1 per unit (50% discount)
  • 500 AU-hours prepaid cost $0.90 per unit (55% discount)
  • 1,000 AU-hours prepaid cost $0.80 per unit (60% discount)
  • 10,000 AU-hours prepaid cost $0.65 per unit (68% discount)
  • 100,000 AU-hours prepaid cost $0.52 per unit (74% discount)

If you use more AU-hours than are included in your prepaid package, you pay an overage of $1.50 per AU-hour for packages of up to 5,000 AU-hours.

For larger packages, the overage price is the same as the reduced price of the package. For example, in a 10,000 AU-hour package, you get a discounted price of $0.65 per unit, and overage is also billed at $0.65 per unit.
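
The package and overage rules above can be combined into a single monthly estimate. This is a minimal sketch: the names are illustrative, and it assumes the prepaid units are directly comparable to the $2 pay-as-you-go unit (one AU for one hour).

```python
PAYG_RATE = 2.0  # USD per unit, pay-as-you-go

# Package size -> discounted price per unit, from the list above.
PACKAGES = {100: 1.0, 500: 0.9, 1_000: 0.8, 10_000: 0.65, 100_000: 0.52}

SMALL_PACKAGE_LIMIT = 5_000  # packages up to this size use the flat overage rate
SMALL_PACKAGE_OVERAGE = 1.5  # USD per unit of overage on small packages


def overage_price(package_units: int) -> float:
    """Overage rate: flat for small packages, package price for larger ones."""
    if package_units <= SMALL_PACKAGE_LIMIT:
        return SMALL_PACKAGE_OVERAGE
    return PACKAGES[package_units]


def monthly_cost(package_units: int, used_units: float) -> float:
    """Prepaid package price plus any overage for usage beyond the package."""
    extra_units = max(0, used_units - package_units)
    return package_units * PACKAGES[package_units] + extra_units * overage_price(package_units)


def discount(package_units: int) -> float:
    """Fractional saving of the package price versus pay-as-you-go."""
    return 1 - PACKAGES[package_units] / PAYG_RATE


# Hypothetical month: 150 units used on the 100-unit package.
print(monthly_cost(100, 150), 150 * PAYG_RATE)  # 175.0 300.0
```

Here the small package still beats pay-as-you-go despite the overage, but heavy overage on a small package narrows the gap, so it is worth rerunning the numbers against the next package size up.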

Azure Data Lake Cost Optimization with Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP capacity can scale into the petabytes, and it supports various use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.

In particular, Cloud Volumes ONTAP provides storage efficiency features, including thin provisioning, data compression, and deduplication, reducing the storage footprint and costs by up to 70%.

Learn more about how Cloud Volumes ONTAP helps reduce costs in these Cloud Volumes ONTAP Storage Efficiency Case Studies.