
What is Cloud Performance and How to Implement it in Your Organization

What is Cloud Performance?

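Cloud performance describes how well your cloud-hosted applications and infrastructure respond under load. Cloud performance monitoring and testing tools help organizations gain visibility into their cloud environments, using specific metrics and techniques to assess performance.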

Efficient cloud performance is critical for maintaining business continuity and ensuring all relevant parties can access cloud services. This applies both to basic public cloud usage and to complex hybrid cloud and multi-cloud architectures.

Cloud performance metrics enable you to effectively monitor your cloud resources, to ensure all components communicate seamlessly. Typically, cloud performance metrics measure input/output operations per second (IOPS), filesystem performance, caching, and autoscaling.

This article is part of our series of guides to HPC on Azure.


Cloud Computing Performance Metrics

There are various metrics that can help you monitor and assess the performance of your cloud computing resources, including IOPS, filesystem performance, caching, and autoscaling.

Input/Output Operations per Second (IOPS)

IOPS measures how many read and write operations your storage can complete per second. It depends on variables like the configuration of the disk array, sequential or random data patterns, data block sizes, and the ratio of write to read operations. IOPS values serve as performance benchmarks for storage devices, and affect the performance of the servers hosting the devices.
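Because block size is one of the variables listed above, the same IOPS figure can imply very different throughput. The following is a minimal sketch of that arithmetic (throughput is roughly IOPS multiplied by block size). The function names and the sample numbers are illustrative assumptions, not vendor figures.

```python
def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Approximate throughput (MiB/s) implied by an IOPS figure at a block size."""
    return iops * block_size_kib / 1024


def iops_needed(target_mib_per_s: float, block_size_kib: float) -> float:
    """IOPS required to sustain a target throughput at a given block size."""
    return target_mib_per_s * 1024 / block_size_kib


# Illustrative numbers only: the same 5,000 IOPS at two block sizes.
small_blocks = throughput_mib_per_s(5000, 4)    # 19.53125 MiB/s
large_blocks = throughput_mib_per_s(5000, 64)   # 312.5 MiB/s
```

This is why an IOPS benchmark is only comparable across devices when the block size and the read/write pattern are the same.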

File Storage Performance

There are two primary options for managing storage in the cloud:

      • Running virtual machines and attaching block storage volumes to them - for example, Azure managed disks
      • Using managed storage services, such as Azure Files or Azure NetApp Files

Because these storage systems interact with applications, they significantly impact cloud performance. You should monitor metrics like latency, IOPS on storage volumes or services, and storage capacity vs. limits on the volume or service.
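As a rough illustration of monitoring these metrics against their limits, the sketch below compares one sample from a volume with its configured limits. The class names, field names, and the 80% headroom threshold are assumptions made for this example. They do not correspond to any Azure API.

```python
from dataclasses import dataclass


@dataclass
class VolumeSample:
    latency_ms: float   # observed average latency
    iops: float         # observed IOPS
    used_gib: float     # capacity currently consumed


@dataclass
class VolumeLimits:
    max_latency_ms: float  # latency target for the workload
    max_iops: float        # IOPS limit of the volume or service
    capacity_gib: float    # provisioned capacity


def check_volume(sample: VolumeSample, limits: VolumeLimits, headroom: float = 0.8):
    """Return warnings for metrics that exceed the headroom fraction of a limit."""
    warnings = []
    if sample.latency_ms > limits.max_latency_ms:
        warnings.append("latency above target")
    if sample.iops > headroom * limits.max_iops:
        warnings.append("IOPS near limit")
    if sample.used_gib > headroom * limits.capacity_gib:
        warnings.append("capacity near limit")
    return warnings
```

In practice the sample values would come from your monitoring service, and the limits from the volume or service tier you provisioned.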


Caching

The purpose of caching is to improve storage access performance. To achieve this, caching techniques temporarily hold data in RAM pools within the compute nodes as it is read from or written to a storage device. Subsequent requests for that data can then be served directly from RAM instead of the disk. This RAM-backed store is called cache memory.

Cache memory provides quick access to frequently used files. Because the cache uses RAM, it has faster access rates than disk read operations. When the filesystem needs data that is already cached, the data is read from the cache, avoiding a slower disk read. To ensure efficient disk performance, caching solutions orchestrate the process, optimizing performance as needed and freeing up central processing units (CPUs).
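The read path described above can be sketched as a small read-through cache. This is a simplified in-memory illustration, assuming a least-recently-used eviction policy and a hypothetical `backend_read` function standing in for the slow disk read. Real caching layers are considerably more involved.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Serve reads from RAM when possible; fall back to the slow backend."""

    def __init__(self, backend_read, capacity=128):
        self._read = backend_read      # hypothetical slow read, e.g. from disk
        self._capacity = capacity      # maximum number of cached entries
        self._data = OrderedDict()     # insertion order tracks recency
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self.hits += 1
            self._data.move_to_end(key)      # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = self._read(key)              # slow path: go to the backend
        self._data[key] = value
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)   # evict the least recently used entry
        return value
```

The ratio of `hits` to total requests gives a rough cache hit rate, which is a useful number to watch when judging whether a cache is sized well for its workload.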


Autoscaling

Autoscaling processes increase or decrease the resources provisioned for a workload as demand changes. There are two types of autoscaling:

      • Vertical scaling—the process of scaling up, during which you add resources such as CPU cores or RAM to an existing machine. You can scale up your network, storage, and compute capabilities. Vertical scaling typically translates into better performance.
      • Horizontal scaling—the process of scaling out, during which you add more nodes. This means you increase the number of servers in your current configuration.

Each cloud vendor provides different scaling options and configurations. Before scaling, check with your vendor to determine costs and specifications for each type of system.
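To make horizontal scaling concrete, the sketch below computes a desired node count from observed utilization using a simple proportional rule. This is an illustrative assumption, not any specific vendor's autoscaling algorithm. The function name, parameters, and bounds are hypothetical.

```python
import math


def desired_nodes(current_nodes, observed_util_pct, target_util_pct,
                  min_nodes=1, max_nodes=10):
    """Scale the node count in proportion to observed vs. target utilization."""
    raw = math.ceil(current_nodes * observed_util_pct / target_util_pct)
    # Clamp to the configured bounds so the policy can never scale to zero
    # or grow without limit.
    return max(min_nodes, min(max_nodes, raw))
```

For example, four nodes running at 90% utilization against a 60% target yields six nodes, while the same four nodes at 30% utilization yields two. The upper bound is also where cost control enters the policy.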

Related content: read our guide to HPC performance

Cloud Performance Testing

Cloud performance testing lets you measure various performance metrics, such as system throughput and latency. Each type of test checks a different aspect of performance:

      • Stress testing—checks the reliability, stability, and responsiveness of your cloud resources when put under an extremely high load.
      • Load testing—checks if the system performs well when multiple users try to use the system simultaneously.
      • Browser testing—determines browser-system compatibility.
      • Latency testing—measures the time needed to move data messages from one point in the network to another.
      • Targeted infrastructure testing—checks for system issues. The process isolates each application component or layer and checks its ability to deliver the required performance.
      • Failover testing—checks whether a system can bring additional resources online during heavy traffic or usage peaks, or when a component fails. This test can help prevent interruptions that degrade the user experience.
      • Capacity testing—can help you identify and benchmark the maximum traffic or load amount that your cloud system can handle efficiently.
      • Soak testing—measures system performance during long periods of heavy traffic. You typically run this test to ensure optimal behavior in production environments.
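Several of the tests above (load, latency, and soak testing in particular) come down to issuing requests concurrently and recording how long each one takes. The following is a minimal sketch of such a harness, assuming a hypothetical `operation` callable that stands in for one request to the system under test. Dedicated load testing tools provide far more control than this.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(operation):
    """Run one operation and return its latency in seconds."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start


def run_load_test(operation, concurrent_users=8, requests_per_user=5):
    """Issue requests from several simulated users; return sorted latencies."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(timed_call, [operation] * total))
    return sorted(latencies)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]
```

Raising `concurrent_users` moves this toward a stress test, and running it for an extended period moves it toward a soak test. Reporting a high percentile such as the 95th, rather than only the average, helps surface the slow requests that users actually notice.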

Cloud Performance Techniques

Select Appropriate Instances

The instance type you choose directly impacts performance. Cloud vendors offer a wide range of instance types, each providing a specific set of components for memory, networking, and storage. Some instances are optimized for memory-intensive applications while others are ideal for compute-intensive workloads. Choosing the appropriate instance for your purposes can help you maintain optimal performance.

Related content: read our guide to legacy apps

Cloud Auto Scaling Services

Autoscaling services automatically scale resources to ensure services are not interrupted during peaks in demand. Many cloud vendors offer autoscaling features, either built in or through extended services. Default configurations or templates can serve as guidelines, but it is typically best to refine these policies so that scaling matches the unique needs of your environment.

Cloud Caching Services

Cloud caching services help improve performance by storing copies of frequently used data for quick access. The majority of cloud vendors offer cloud caching capabilities, including Google’s App Engine Memcache, Amazon’s ElastiCache, and Azure Cache for Redis. To keep the cache consistent with the principal data stores, you need to define processes that expire and update cache content. You can reference vendor documentation when creating your design.
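The expire-and-update processes mentioned above can be sketched with a time-to-live (TTL) policy plus explicit invalidation on writes. This is a simplified in-process illustration. The class and method names are hypothetical and do not correspond to the API of any of the managed services named above.

```python
import time


class TTLCache:
    """Cache entries expire after ttl_seconds or when explicitly invalidated."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock     # injectable clock, which makes expiry testable
        self._entries = {}      # key -> (value, expires_at)

    def get(self, key, load):
        """Return a fresh cached value, or reload it from the principal store."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value = load(key)       # hypothetical loader for the principal data store
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop an entry, for example right after the principal store is updated."""
        self._entries.pop(key, None)
```

A short TTL limits how stale a cached value can become at the cost of more reads from the principal store. Invalidating on write keeps the two consistent without waiting for expiry.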

Cloud Performance Monitoring

Cloud monitoring helps you gain visibility into your cloud performance through metric-based insights. You can use this information to troubleshoot performance during incidents and to assess how systems perform over specific periods of time, then apply these insights to continuously optimize performance and the associated costs.
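One simple way to turn raw metrics into an actionable signal is to average them over a rolling window and compare the result to a threshold. The sketch below assumes a hypothetical `RollingMetric` helper and an arbitrary window size. Managed monitoring services offer richer aggregation and alerting than this.

```python
from collections import deque


class RollingMetric:
    """Keep the most recent samples of a metric and flag sustained breaches."""

    def __init__(self, window=5):
        self._samples = deque(maxlen=window)   # oldest samples drop off automatically

    def add(self, value):
        self._samples.append(value)

    def average(self):
        return sum(self._samples) / len(self._samples) if self._samples else 0.0

    def breaches(self, threshold):
        """True only when the window is full and its average exceeds the threshold."""
        window_full = len(self._samples) == self._samples.maxlen
        return window_full and self.average() > threshold
```

Requiring a full window before alerting avoids reacting to a single spike, which matters when the same signal also drives autoscaling or incident response.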

Cloud Performance with Azure NetApp Files

Azure NetApp Files is a Microsoft Azure file storage service built on NetApp technology, giving you the file capabilities in Azure that even your core business applications require.

Bring enterprise-grade data management and storage to Azure so you can manage your workloads and applications with ease, and move all of your file-based applications to the cloud.

Azure NetApp Files solves performance and availability challenges for enterprises that want to move mission-critical applications to the cloud, including HPC, SAP, Linux, Oracle/SQL Server workloads, Windows Virtual Desktop, and more.

In particular, Azure NetApp Files allows you to migrate more applications to Azure, even your business-critical workloads, with extreme file throughput and sub-millisecond response times.
