
How Azure NetApp Files Supports HPC Workloads in Azure

Written by Jeff Whitaker, Cloud Data Services | Feb 21, 2020 4:18:04 PM

With the advent of cloud services and VM SKUs that offer massive compute power, Azure has become the preferred destination for the deployment and migration of high-performance computing (HPC) workloads in the cloud. But to support HPC environments on Azure, you need more than compute power; you need high-performance, resilient storage.

Azure NetApp Files (ANF), Microsoft’s managed file share service, can be rapidly deployed and scaled to meet application requirements. ANF is the ideal storage service for HPC solutions. 

In this blog, we’ll explore Azure storage for HPC workloads and the challenges involved. We’ll take a look at how to deploy business-critical HPC workloads in Azure through ANF and the many benefits of its enterprise-class data management features, including multiple service levels, on-demand scalability, and out-of-the-box high availability.

What Azure NetApp Files Brings to HPC Workloads 

For HPC applications to thrive in the cloud, they must be able to access the data layer and retrieve information without disruption or delay. Through its rapid scalability features, Azure NetApp Files enables that high level of functionality. Since ANF offers on-demand resources, HPC workloads gain agility in the cloud.

Consider the oil and gas industry, where HPC applications, like Petrel, are used to analyze petabytes of information for geological and reservoir modeling. These outputs depend heavily on high throughput and application storage layer reliability. One such success story is that of Repsol, in which ANF was integrated with Petrel in Azure to deliver performance levels 6.5 times higher than those of on-premises HPE 3PAR storage, thus showing customers can deploy HPC workloads in ANF with confidence. As Repsol Reservoir Engineer Discipline Manager Greg Walker attested, “One of the key elements that was failing us was the storage; Azure NetApp Files was a lifesaver here. We have seen amazing performance increases.”

Let’s look at what you’ll need to run HPC applications in the cloud (and specifically what ANF offers to meet those demands).

Shared Storage Access with High Throughput



If you’re familiar with HPC applications, you’ve likely worked with the parallel processing model, which requires that users have high-speed access to shared data. Using ANF, data can be accessed in parallel by different compute nodes for faster outcomes. By moving your HPC applications to Azure storage, which supports shared access to data, you give your applications bare-metal performance to meet the demands of parallel compute requests.

High Availability of the Data Layer


The data layer has to be highly available so that users have uninterrupted access to the HPC application stack. Because all HPC computations depend on that data, the storage layer must be reachable over multiple paths. ANF was built to strengthen applications by making (and keeping) the data layer highly available to applications and users, so there’s little to no configuration needed.

Ease of Management


While on-premises environments require a dedicated storage team to handle storage demands, cloud deployments don’t. ANF largely eliminates the need for a storage expert to handle configuration, although NetApp experts are on hand if you need a little help. ANF can be provisioned and managed from the Azure portal or through automation scripts like any other Azure service. In short, you don’t have to be a storage expert to manage ANF. All you need is an internet connection, and volumes can be spun up in a matter of minutes and delivered to HPC workloads in connected VNets.

Scalability Means Flexibility


HPC workload data sets often increase and decrease dramatically in size depending on the use case. Your cloud storage should match the crests and troughs of your cloud computing resources. Through on-demand scaling, ANF meets the scale-up and scale-down requirements of your workloads.

Resiliency


Time-sensitive transactions for HPC on Azure should be persisted in order to maintain data integrity. Access disruptions can be catastrophic. Storage must therefore be resilient to transient failures and underlying cloud hardware errors to prevent corruption and data loss. Because ANF is built on resilient ONTAP storage technology, every provisioned volume is protected from hardware, software, or network failures.

Do I Manage Azure NetApp Files?

The Azure NetApp Files managed file share service combines NetApp® ONTAP’s trusted storage management capabilities with the agility of Azure. While built on top of NetApp technology, ANF is offered as a first-party service in Azure, sold and supported by Microsoft, which simplifies its implementation and management.

Offering multi-protocol support, high throughput with three service tiers, and high availability and scalability, ANF is a one-stop solution for your file share requirements in Azure. Any HPC application environment, including databases, DevOps, big data, and analytics, can benefit from ANF’s capabilities.

ANF is provisioned and managed using the same processes and tools as any other Azure cloud service. It eliminates the need to build out full-fledged SMB/NFS file clusters for provisioning file shares. The overhead of designing, implementing, and maintaining such NFS/SMB file clusters far outweighs the benefits of the approach. ANF also supports NFSv3 for legacy applications, NFSv4.1 for modern Linux Azure workloads, and SMB protocols for Windows servers. Moreover, its three service levels make it easy to meet varying workload demands.

In addition, ANF can be integrated with NetApp’s Cloud Sync service for efficient transfer of large data sets to the cloud for big data analytics applications. Moreover, ANF enables faster data access for HPC applications that require a stable, high-throughput storage layer.

The Nitty-Gritty of Azure Storage with NetApp Technology

Meeting the ever-changing storage demands of an HPC environment is no easy task and requires a specialized storage solution. ANF offers capabilities far exceeding manually built file share services in Azure and at a considerably lower cost.

Let’s take a closer look at the advantages we mentioned above.

Out-of-the-Box High Availability

Based on NetApp ONTAP® technology and deployed in Azure data centers as a native service, ANF’s built-in high availability eliminates the need for additional availability configurations. Once provisioned and attached to the HPC workloads, the ANF volume will remain highly available throughout the application lifecycle.

Single Management Pane

ANF can be provisioned, configured, and managed from the Azure portal or using automation tools like Azure CLI, Azure PowerShell, and the REST API. Compute, storage, and network components of HPC applications in Azure can be managed from a single pane. Access to the ANF management pane can be controlled through Azure RBAC, as with other Azure services, so there is no additional learning curve.

Scale on Demand

ANF is capable of scaling from a minimum of 100 GB to a maximum size of 100 TB per volume with just a few clicks from the portal or through Azure CLI. This eliminates the need for proactive capacity planning to accommodate growing data sets. Capacity can be scaled on demand with minimal administrative overhead.
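To make the scaling limits concrete, here is a minimal Python sketch that suggests a new volume quota as a data set grows. It only does the arithmetic; it does not call any Azure API. The function name next_quota_gb and the 20 percent headroom default are hypothetical choices for illustration, and the sketch assumes decimal units (1 TB = 1000 GB) as the figures are written above.

```python
import math

# Per-volume limits quoted in this article.
MIN_VOLUME_GB = 100
MAX_VOLUME_GB = 100 * 1000  # 100 TB, assuming 1 TB = 1000 GB


def next_quota_gb(used_gb: float, headroom_pct: int = 20) -> int:
    """Suggest a quota in GB that leaves headroom above current usage.

    The name and the headroom default are illustrative assumptions,
    not part of any Azure tool.
    """
    wanted = math.ceil(used_gb * (100 + headroom_pct) / 100)
    if wanted > MAX_VOLUME_GB:
        raise ValueError(
            f"{wanted} GB exceeds the {MAX_VOLUME_GB} GB per-volume maximum"
        )
    # Never suggest less than the minimum volume size.
    return max(MIN_VOLUME_GB, wanted)


if __name__ == "__main__":
    for used in (50, 1000, 40_000):
        print(f"{used} GB used: suggest a {next_quota_gb(used)} GB quota")
```

The suggested value would then be applied through the portal or Azure CLI as described above.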

Storage Durability

Azure NetApp Files offers 99.9999999% data durability and meets the resiliency requirements of HPC workload data layers. There’s also no need to worry about the data integrity of HPC workload databases hosted on ANF, as the storage is protected from downtime through NetApp’s highly trusted storage technology in the Azure data centers.

Support for Shared Access

A single ANF volume can be mounted to many machines at the same time. This shared access feature is crucial for HPC workloads, where several computing nodes are used to process the data in parallel. Such wide-scale access of high-performance file shares reduces deployment cycles and delivers faster results.
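As a rough illustration of the parallel access pattern, the Python sketch below has several workers read files from one shared directory at the same time. It rests on assumptions: the mount path /mnt/anf-volume is hypothetical, the checksum stands in for real per-file HPC work, and the threads on one machine stand in for what would be separate compute nodes mounting the same volume.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def checksum(path: Path) -> str:
    """Read one file from the shared directory and return its SHA-256 digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def checksum_shared_files(mount_point: str, workers: int = 8) -> dict[str, str]:
    """Process every file directly under mount_point in parallel.

    Each worker reads a different file from the same shared location,
    which mimics compute nodes splitting up one data set.
    """
    files = sorted(p for p in Path(mount_point).iterdir() if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(checksum, files))
    return {f.name: d for f, d in zip(files, digests)}


if __name__ == "__main__":
    # Hypothetical mount point; replace with wherever the volume is mounted.
    for name, digest in checksum_shared_files("/mnt/anf-volume").items():
        print(name, digest)
```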

Service Levels

Designed to operate at sub-millisecond latencies, ANF enables HPC workloads to operate at performance standards that can even exceed those of on-premises environments. ANF volumes can be configured to use one of three service levels. Per provisioned terabyte, the Standard tier supports a throughput of 16 MB/s, the Premium tier 64 MB/s, and the Ultra tier 128 MB/s.

The performance level can be changed on demand based on the varying workload requirements of your HPC environments for optimal cost-performance balance.
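Because throughput scales with provisioned capacity, a little arithmetic helps when weighing service levels. The Python sketch below uses only the per-terabyte figures quoted in this article; the function names are hypothetical, and it assumes throughput grows linearly with provisioned size, so treat the numbers as planning estimates rather than guarantees.

```python
# Throughput per provisioned terabyte, in MB/s, as quoted in this article.
THROUGHPUT_PER_TB = {"Standard": 16, "Premium": 64, "Ultra": 128}


def volume_throughput(service_level: str, provisioned_tb: float) -> float:
    """Estimate the throughput ceiling in MB/s for a volume of the given size."""
    return THROUGHPUT_PER_TB[service_level] * provisioned_tb


def capacity_for_throughput(service_level: str, target_mbps: float) -> float:
    """Estimate the provisioned TB needed to reach a target throughput."""
    return target_mbps / THROUGHPUT_PER_TB[service_level]


if __name__ == "__main__":
    target = 512  # MB/s, an example target chosen for illustration
    for level in THROUGHPUT_PER_TB:
        at_4tb = volume_throughput(level, 4)
        needed = capacity_for_throughput(level, target)
        print(f"{level}: 4 TB gives {at_4tb} MB/s; "
              f"{target} MB/s needs {needed} TB")
```

For example, under these assumptions a 4 TB Premium volume would be estimated at 256 MB/s, while reaching 512 MB/s would take 32 TB at Standard but only 4 TB at Ultra.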

The ANF Advantage for Your HPC Applications in the Cloud

Deploying HPC applications in Azure requires careful planning of compute, storage, and networking components, with storage playing a critical role; it is the service’s capabilities in terms of availability, scalability, durability, and performance that determine how efficient your HPC workloads will be.

Take your HPC cloud capabilities to the next level. Subscribe to Azure NetApp Files today.

More HPC on Azure content: