
AWS FSx for Lustre vs EFS: Head to Head

Written by Yifat Perry, Technical Content Manager | Nov 1, 2021 9:40:39 AM

What Is AWS FSx for Lustre?

FSx for Lustre is a fully managed service that offers high-performance, scalable, cost-effective storage for compute workloads. Many workloads, including high-performance computing, financial simulation, video rendering, and machine learning, rely on compute instances having access to the same data through high-performance shared storage.

AWS FSx for Lustre provides sub-millisecond latencies, millions of IOPS, and throughput of up to hundreds of gigabytes per second. It offers several deployment types and storage options so you can optimize performance and cost for your workload.

What Is AWS EFS?

Amazon Elastic File System (EFS) is a simple, serverless, set-and-forget elastic file system that can be used with AWS cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications. It grows and shrinks automatically as you add and remove files, so you do not need to provision and manage capacity to allow for growth.

The Amazon EFS web services interface lets you create and configure file systems simply. The service manages all the file storage infrastructure on your behalf, so you don’t need to handle the deployment, maintenance, and patching of complex file system configurations.


AWS FSx for Lustre Features

Here are some of the important features of FSx for Lustre.

High Performance and Scalability

FSx for Lustre delivers the performance needed for a broad range of high-performance workloads. The Lustre file system is optimized for data processing, with sub-millisecond latencies and throughput that scales to hundreds of gigabytes per second.

You can access the same high-performance shared storage from tens of thousands of compute instances. You can also scale up storage capacity as needed.

Linux Workload Compatibility

FSx for Lustre is POSIX-compliant and designed to support any Linux workload, so you can run your existing Linux-based applications without modification. It provides a native file system interface that works with the Linux operating system. It supports file locking and read-after-write consistency, meaning that data you write is visible to subsequent reads. You can access your file system from the cloud (using Amazon EKS clusters or Amazon EC2 instances) or from on-premises (via AWS VPN or AWS Direct Connect).
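The file locking and read-after-write behavior described above can be exercised with ordinary POSIX calls. The following is a minimal sketch, not taken from the original article: it uses a temporary directory as a stand-in for the mount point (a path such as `/fsx` would be an assumption about your setup), and the function names are hypothetical. It also assumes the file system is mounted in a way that supports advisory locks.

```python
import fcntl
import os
import tempfile


def append_record(path, record):
    """Append one line to a shared file while holding an exclusive advisory lock."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the write through to the file system
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_records(path):
    """Read back every line currently stored in the shared file."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


# Stand-in for the file system mount point (hypothetical; replace with your own path).
mount_dir = tempfile.mkdtemp()
shared_file = os.path.join(mount_dir, "results.log")

append_record(shared_file, "job-1 done")
append_record(shared_file, "job-2 done")
records = read_records(shared_file)  # reads see the writes made just before
```

Because the code relies only on standard file operations, the same script should run unchanged against a local disk or a mounted shared file system, which is the point of POSIX compatibility.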

Security and Compliance

FSx for Lustre automatically encrypts your data at rest, and in transit when accessed from supported Amazon EC2 instances. It is ISO, SOC, and PCI DSS compliant, and HIPAA eligible. Access is controlled with POSIX permissions, which govern file access, and Amazon VPC security group rules, which govern network access. To meet data protection requirements, you can also use AWS Backup for compliance management and centralized backup of persistent FSx for Lustre file systems that are not linked to an Amazon S3 data repository.

Learn more in our detailed guide to FSx for Lustre

AWS EFS Features

Here are some of the important features of Amazon EFS.

Storage Flexibility

EFS provides two families of storage classes, Standard and One Zone. Each has a performance-optimized class for frequently accessed files and an Infrequent Access (IA) class for files that are accessed less often. EFS Lifecycle Management reduces costs by automatically moving files that have not been accessed recently from Standard storage to EFS Standard-IA, or from One Zone storage to EFS One Zone-IA.

Throughput Mode

Amazon EFS provides two throughput modes: Bursting and Provisioned. The throughput mode determines the overall throughput a file system can achieve. In Bursting Throughput mode, throughput scales with the size of the file system and bursts dynamically as needed to support the spiky nature of many file-based workloads.

Provisioned Throughput mode is designed for applications that need more dedicated throughput than the Bursting mode default, and it can be configured independently of the amount of data stored on the file system.
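To make the choice between the two modes concrete, here is a rough sketch of the arithmetic. It assumes a Bursting baseline of 50 MiB/s per TiB of stored data, a figure taken from AWS documentation rather than from this article, so verify it against current documentation. The function names are hypothetical, and the sketch ignores burst credits entirely.

```python
# Assumed baseline rate for Bursting Throughput mode (MiB/s per TiB stored).
# This value is an assumption for illustration; check current AWS documentation.
BASELINE_MIBPS_PER_TIB = 50.0


def bursting_baseline_mibps(stored_gib):
    """Estimate the baseline throughput that Bursting mode offers at a given size."""
    return stored_gib / 1024.0 * BASELINE_MIBPS_PER_TIB


def suggest_throughput_mode(stored_gib, sustained_mibps):
    """Suggest a mode by comparing sustained demand with the estimated baseline.

    Simplification: short spikes above the baseline can be served from burst
    credits, which this sketch does not model.
    """
    if sustained_mibps <= bursting_baseline_mibps(stored_gib):
        return "bursting"
    return "provisioned"


small_fs = suggest_throughput_mode(stored_gib=100, sustained_mibps=40)
large_fs = suggest_throughput_mode(stored_gib=2048, sustained_mibps=40)
```

Under these assumptions, a small file system with a high sustained throughput requirement is the typical case where Provisioned Throughput is worth considering, because its size alone does not earn enough baseline throughput.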

Storage Class and Lifecycle Management

Amazon EFS provides the Standard and One Zone storage classes for frequently accessed files. Both are optimized for performance and deliver consistently low latencies.

The Amazon EFS Standard-Infrequent Access (EFS Standard-IA) and Amazon EFS One Zone-Infrequent Access (EFS One Zone-IA) storage classes are cost-optimized for files that are accessed less often. You can start reducing storage costs by enabling EFS Lifecycle Management on your file system and choosing an age-off policy.
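An age-off policy can be expressed as a small request payload. The sketch below builds the parameters for the EFS `put_lifecycle_configuration` API call; the helper name, the file system ID, and the restricted set of day values are assumptions chosen for illustration, and the service may accept other values as well.

```python
# Day values this sketch accepts; an illustrative subset, not an exhaustive list.
ALLOWED_IA_DAYS = (7, 14, 30, 60, 90)


def lifecycle_request(file_system_id, days_until_ia):
    """Build put_lifecycle_configuration parameters for an age-off policy."""
    if days_until_ia not in ALLOWED_IA_DAYS:
        raise ValueError(f"days_until_ia must be one of {ALLOWED_IA_DAYS}")
    return {
        "FileSystemId": file_system_id,
        "LifecyclePolicies": [
            # Files not accessed for this many days move to the IA storage class.
            {"TransitionToIA": f"AFTER_{days_until_ia}_DAYS"},
        ],
    }


# "fs-12345678" is a placeholder ID, not a real file system.
request = lifecycle_request("fs-12345678", 30)
```

With credentials configured, the payload could be passed to `boto3.client("efs").put_lifecycle_configuration(**request)`. Building the request separately keeps the policy easy to review before it is applied.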

When Should You Use AWS FSx for Lustre vs AWS EFS?

Amazon EFS is designed to provide the IOPS, throughput, and low latency required by a wide variety of workloads. With Amazon EFS you can choose between two performance modes and two throughput modes:

  • Performance modes: the default General Purpose mode suits latency-sensitive use cases such as content management systems, web serving environments, and general file serving. File systems in Max I/O mode can scale to higher levels of aggregate throughput and operations per second, at the cost of slightly higher latency for file metadata operations.

  • Throughput modes: in the default Bursting Throughput mode, throughput scales as the file system grows. Provisioned Throughput mode lets you specify the throughput of your file system independently of the amount of data stored.
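The two choices above map onto parameters of the EFS `create_file_system` API call. The following sketch builds such a request without contacting AWS. The helper name and the creation token are hypothetical, and the parameter values reflect the modes described in this article, so check current AWS documentation before relying on them.

```python
PERFORMANCE_MODES = ("generalPurpose", "maxIO")
THROUGHPUT_MODES = ("bursting", "provisioned")


def efs_create_request(creation_token,
                       performance_mode="generalPurpose",
                       throughput_mode="bursting",
                       provisioned_mibps=None):
    """Build create_file_system parameters for the chosen modes."""
    if performance_mode not in PERFORMANCE_MODES:
        raise ValueError(f"unknown performance mode: {performance_mode}")
    if throughput_mode not in THROUGHPUT_MODES:
        raise ValueError(f"unknown throughput mode: {throughput_mode}")

    request = {
        "CreationToken": creation_token,
        "PerformanceMode": performance_mode,
        "ThroughputMode": throughput_mode,
        "Encrypted": True,
    }
    if throughput_mode == "provisioned":
        # Provisioned mode needs an explicit throughput figure in MiB/s.
        if provisioned_mibps is None:
            raise ValueError("provisioned mode requires provisioned_mibps")
        request["ProvisionedThroughputInMibps"] = float(provisioned_mibps)
    return request


default_request = efs_create_request("web-content")
analytics_request = efs_create_request(
    "analytics-scratch",
    performance_mode="maxIO",
    throughput_mode="provisioned",
    provisioned_mibps=256,
)
```

With credentials configured, either dictionary could be passed to `boto3.client("efs").create_file_system(**request)`. Note that the performance mode is fixed at creation time, so it is worth deciding before the file system holds data.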

Learn more in our detailed guide to AWS EFS Performance

Amazon FSx for Lustre is useful for demanding workloads that require highly available and durable storage:

  • High Performance Computing (HPC): these workloads store, analyze, and process huge amounts of data. They are processing-heavy and run for extended periods, so they need dependable storage to persist their data. Examples include machine learning, autonomous vehicles, genomics, financial modeling, research, EDA, and more.

  • Persistent storage for containers (Amazon EKS, Kubernetes storage): use the persistent file system deployment option with Amazon EKS clusters or self-managed Kubernetes for containerized machine learning and HPC workloads. Because containers are ephemeral, data written inside a container is lost when the container stops. A persistent file system suits applications whose data must outlive the container.

  • Data lakes on S3: organizations hosting data lakes on Amazon S3 can quickly spin up an FSx for Lustre file system linked to their S3 bucket or prefix, which makes the stored data readily accessible to compute. An FSx for Lustre persistent file system acts as a fast caching layer with no changes to your applications, so analytics jobs run faster and compute costs are reduced.
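As an illustration of the data lake case, the sketch below builds the parameters for an FSx `create_file_system` API call that links a Lustre file system to an S3 prefix through `ImportPath`. The helper name, subnet ID, bucket path, and default sizes are all assumptions made for the example, and the capacity check reflects sizing rules as understood at the time of writing, so confirm them against current AWS documentation.

```python
def fsx_lustre_request(subnet_id, s3_import_path,
                       storage_gib=1200,
                       deployment_type="PERSISTENT_1",
                       per_unit_throughput=200):
    """Build create_file_system parameters for an S3-linked Lustre file system."""
    if not s3_import_path.startswith("s3://"):
        raise ValueError("s3_import_path must look like s3://bucket/prefix")
    # Assumed sizing rule: 1200 GiB, or a multiple of 2400 GiB.
    if storage_gib != 1200 and storage_gib % 2400 != 0:
        raise ValueError("storage_gib must be 1200 or a multiple of 2400")

    lustre_config = {
        "DeploymentType": deployment_type,
        "ImportPath": s3_import_path,  # S3 bucket or prefix backing the file system
    }
    if deployment_type.startswith("PERSISTENT"):
        # Throughput per TiB of storage only applies to persistent deployments.
        lustre_config["PerUnitStorageThroughput"] = per_unit_throughput

    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": storage_gib,
        "SubnetIds": [subnet_id],
        "LustreConfiguration": lustre_config,
    }


# Placeholder subnet and bucket names, not real resources.
request = fsx_lustre_request("subnet-0abc1234", "s3://example-data-lake/raw/")
```

With credentials configured, the dictionary could be passed to `boto3.client("fsx").create_file_system(**request)`. Keeping the S3 path in one place makes it easy to point separate file systems at different prefixes of the same data lake.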

Amazon FSx for NetApp ONTAP

In collaboration with NetApp, AWS has launched Amazon FSx for NetApp ONTAP, a new cloud-based managed shared file and block storage service that brings the best of both worlds to their customers.

FSx for ONTAP delivers NFS, SMB, and iSCSI storage powered by NetApp’s advanced data management system, with features and benefits that go beyond other AWS offerings.

Click here for a step-by-step walkthrough on how to set up your own FSx for ONTAP environment with BlueXP Console.