Azure Latency Reductions with Cloud Volumes ONTAP and NVMe Caching

Latency is a clear measure of the efficiency of your storage layer, no matter which cloud you’re using. Deployments in Azure can benefit from reduced storage latency thanks to Cloud Volumes ONTAP’s intelligent NVMe caching feature.

Caching helps achieve the lowest possible latency for your workloads hosted in Azure, thereby enhancing the user experience, especially for highly transactional services like databases. In this blog post we will explore some of the causes of latency and how NVMe caching with Cloud Volumes ONTAP can help improve your application performance in Azure.

Latency: Contributing Factors

In the most general terms, latency refers to the time taken for a system to complete an action or to provide an output for a transaction. The related term throughput refers to the number of actions or outcomes that the system can deliver in a given unit of time. The performance of a storage system depends on both. Bandwidth, by comparison, indicates the size of the data pipeline, while throughput measures how much data actually traverses it. For a fixed number of outstanding requests, throughput goes down as latency increases, which can leave bandwidth under-utilized and degrade the performance of the storage system.
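The inverse relationship between latency and throughput can be made concrete with Little's law: with a fixed number of requests in flight, the achievable operation rate is roughly that concurrency divided by the latency per request. The sketch below is a back-of-the-envelope model rather than a measurement; the function name and the sample numbers are illustrative assumptions, not figures from Azure or NetApp.

```python
def max_ops_per_second(outstanding_requests: int, latency_ms: float) -> float:
    """Upper bound on operations per second from Little's law.

    throughput = concurrency / latency, with latency given in milliseconds.
    """
    if outstanding_requests < 0:
        raise ValueError("outstanding_requests must not be negative")
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return outstanding_requests * 1000.0 / latency_ms


if __name__ == "__main__":
    # Illustrative numbers only: 32 requests in flight at rising latencies.
    for latency_ms in (0.5, 1.0, 2.0, 4.0):
        ops = max_ops_per_second(32, latency_ms)
        print(f"{latency_ms:>4} ms per request -> at most {ops:>8.0f} ops/s")
```

Each doubling of latency halves the ceiling, which is why a storage layer can sit well below its bandwidth limit while still being the bottleneck.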

Several factors contribute to latency in the public cloud:

  • Multiple hops: Data traverses multiple hops in a network before reaching a target in the cloud. Top down, the architecture might require a transaction to be sent from a web server to application worker servers before finally reaching the database layer.
  • Hardware constraints: The hardware supporting the servers or their virtualization layer has CPU-, memory-, network-, and storage-related constraints that can add to the overall latency. There are also multiple devices in the network path on the internet (routers, firewalls, IDS/IPS systems, and so on) that the data travels through before reaching its destination. The processing and inspection performed on each of those devices adds to the overall latency.
  • Suboptimal routing: The path selected by the routing devices for data to reach the destination isn’t always optimal.
  • Physical infrastructure: Last but not least, the physical medium (the wires and cables that transmit the data) also has limits that can increase latency.

In a cloud deployment many of these components are abstracted from the users. However, latency can still be optimized by choosing the right cloud tools and services to ensure a good user experience.

Understanding Cloud Latency

The cloud adds its own nuances to managing latency. Azure provides different tuning options for compute, storage, and network elements that help you manage cloud computing latency effectively.

Before we dive into the details of how to solve your cloud latency challenges, it’s important to understand the main factors that contribute to latency in the cloud in the first place:

  • The cloud, under the hood: The cloud itself is made up of physical resources in data centers owned and managed by the public cloud service providers. Where users are located, and how close they are to the cloud resources they access, affects latency and user experience.
  • Azure connectivity: You can connect your data center to Azure through a VPN or through Azure ExpressRoute. VPN traffic traverses the internet, so all the associated latencies still affect the user experience. ExpressRoute, on the other hand, offers dedicated bandwidth and direct connectivity to Azure data centers, thereby reducing latency.
  • Azure VNet architecture: The overall architecture within the Azure VNet—which could include perimeter devices like WAF, Azure firewalls, internal load balancers within application tiers, etc.—also adds to the number of hops your data needs to make, thereby increasing latency in multitier applications.
  • Storage layer: The storage used by your cloud applications also plays a key role in latency. Azure disk storage can use both HDD and SSD drives in the backend. HDD drives can help reduce the overall cost but support fewer IOPS than SSD drives.

What Is Latency in Azure?

Azure latency, in the simplest terms, is the time it takes for applications to respond to requests made by users. All the factors discussed above contribute to this latency. Along with reducing the number of hops in the application architecture, you can considerably reduce latency by choosing the right storage layer for your workloads in Azure. This is one of the main tuning options to focus on in the overall design.

How Do I Check Azure Latency?

Customers can use tools like SockPerf (for Linux) and latte.exe (for Windows) to measure network latency. You can also use third-party sites like Azure Speed Test 2.0 to run Azure latency tests from your browser location against storage services in different Azure data centers. Note that these latency checks may not be conclusive for your application. You should also review the metrics from Azure Monitor to understand the read/write performance of your storage.
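Whichever tool produces the samples, latency is usually more informative as percentiles than as a single average, since a small share of slow requests often dominates the user experience. The sketch below is a generic timing harness in Python, not a replacement for SockPerf, latte.exe, or Azure Monitor; the function names are hypothetical, and the demo times small reads from a temporary local file, which the operating system will most likely serve from its page cache.

```python
import math
import os
import tempfile
import time
from typing import Callable, Dict, List


def time_operation(op: Callable[[], object], samples: int = 100) -> List[float]:
    """Call op repeatedly and return each call's duration in milliseconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        op()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list, for 0 < pct <= 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def summarize(values: List[float]) -> Dict[str, float]:
    """Summarize latency samples as median, tail percentiles, and maximum."""
    return {
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
        "max": max(values),
    }


if __name__ == "__main__":
    # Demo only: time 4 KiB reads from a temporary file on the local disk.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(os.urandom(4096))
        path = handle.name

    def read_block() -> bytes:
        with open(path, "rb") as source:
            return source.read(4096)

    try:
        for name, value in summarize(time_operation(read_block, 200)).items():
            print(f"{name}: {value:.3f} ms")
    finally:
        os.remove(path)
```

To measure something closer to your application, pass a callable that issues a representative request, such as a database query or a read from the mounted volume, in place of `read_block`.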

Improve Azure Cloud Storage Latency by Using Cloud Volumes ONTAP’s NVMe Intelligent Caching

Cloud Volumes ONTAP is NetApp’s flagship data management solution that brings the trusted ONTAP data management technology to Azure, AWS, and GCP. On Azure, it leverages Azure disk storage in the backend and enhances the return on investment through added storage efficiency, data protection, high availability and flexibility.

While HA systems use Premium page blobs, Azure managed disks are used in single-node Cloud Volumes ONTAP systems. Azure Premium SSD, Standard SSD and Standard HDD can be used as underlying cloud storage for Cloud Volumes ONTAP:

  • SSD managed disks provide higher IOPS: Standard SSD is recommended for workloads with consistent performance requirements, and Premium SSD for workloads that need higher IOPS.
  • Standard HDD should be used in scenarios where higher IOPS is not required and you want to optimize costs.

Disk sizing plays an important role in the overall performance of Azure premium disks. IOPS and throughput are capped based on the SKU being used. Larger SKUs provide higher IOPS and throughput but are more expensive. For example, a 512 GiB Premium SSD (P20) is provisioned for 2,300 IOPS, while a 256 GiB disk (P15) is provisioned for 1,100 IOPS.

How Do I Lower my Azure Latency?

The Cloud Volumes ONTAP configuration can leverage NVMe local storage to reduce latency and improve the response time of applications. Cloud Volumes ONTAP uses an intelligent caching method in which recently accessed user data and NetApp metadata are cached in NVMe for faster read operations. Intelligent NVMe caching is supported for the Standard_L8s_v2 SKU in single-node Cloud Volumes ONTAP using the BYOL option in Azure. Note that compression should be disabled on Cloud Volumes ONTAP volumes to benefit from the performance improvement offered by intelligent NVMe caching.

NVMe intelligent caching greatly accelerates the performance of workloads with heavy read transactions. As the data is read from the cache, the round-trip latency is reduced. Because those reads no longer reach the underlying disks, the disk IOPS are freed up to process write requests. As a result, both read and write requests can have faster turnaround times, thereby improving the overall application performance and user experience. It is best suited for use cases like transaction-intensive databases, performance-sensitive LOB applications, file services, email systems, and other critical business components.
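How much a read cache helps depends largely on how often reads are served from it. A simple way to reason about this is a weighted average of the cache and disk latencies by hit ratio. The sketch below is a generic model and does not describe how Cloud Volumes ONTAP implements its cache; the function name and all latency figures are illustrative assumptions.

```python
def effective_read_latency_ms(
    hit_ratio: float, cache_latency_ms: float, disk_latency_ms: float
) -> float:
    """Average read latency when a fraction of reads is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * disk_latency_ms


if __name__ == "__main__":
    # Assumed figures for illustration: 0.2 ms from cache, 2.0 ms from disk.
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):
        latency = effective_read_latency_ms(hit_ratio, 0.2, 2.0)
        print(f"hit ratio {hit_ratio:.2f} -> {latency:.3f} ms average read")
```

The model also shows why read-heavy workloads with a working set that fits in the cache benefit most: the average only approaches the cache latency when the hit ratio is high.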


Latency reduction is a non-negotiable aspect of designing production systems in the cloud. Managing the efficiency of the storage layer’s read and write operations is a crucial part of this process. If not managed properly, high latency can result in a poor user experience and, in turn, loss of business. Cloud Volumes ONTAP helps address this pain point by reducing the latency of read and write operations of the underlying storage layer through NVMe intelligent caching.

Intelligent NVMe caching, which reduces the overall Azure storage latency, is just one of the benefits offered by Cloud Volumes ONTAP. It also provides additional value through features like data protection with NetApp Snapshot™ technology, SnapMirror® data replication, data security through encryption, and a full host of storage efficiency features. The high availability configuration of Cloud Volumes ONTAP ensures that the nodes are deployed in availability sets to protect your data from Azure data center failures. This configuration supports an RPO of 0 seconds and an RTO of less than 60 seconds. Cloud Volumes ONTAP also makes it easy to configure and manage multiple environments through a unified interface provided by NetApp Cloud Manager.

Robert Bell, Product Evangelist