April 28, 2023
Topics: Cloud Volumes ONTAP, File Services, AWS, Advanced, Caching (8 minute read)
Cloud computing has revolutionized the way we store and access data. However, one of the challenges of using the cloud is the need for efficient data transfer. Transferring data between the cloud and local storage can be slow and costly, especially for large files. This is where file caching comes in.
File caching is the process of storing frequently accessed data on local storage, reducing the need for data transfer between the cloud and local storage. This can significantly improve the performance of applications that rely on cloud storage, especially those that require fast access to data.
In this article, we’ll take an in-depth look at the Amazon File Cache service, including how it works and its main use cases.
Jump down to a specific topic in this article:
- What is Amazon File Cache?
- Key Amazon File Cache Use Cases and Benefits
- How Does Amazon File Cache Work?
- Cloud Volumes Edge Cache Caching Solution
- Summary and Key Takeaways
What is Amazon File Cache?
Announced at AWS Storage Day 2022, Amazon File Cache became generally available to AWS customers shortly afterward. It is a fully managed caching service that gives workloads high-speed access to files scattered across different AWS storage locations and on-premises environments.
Amazon File Cache extends the AWS storage portfolio and complements the existing built-in AWS storage services. It works out of the box with Amazon FSx for NetApp ONTAP, Amazon FSx for OpenZFS, Amazon S3 buckets, or any Network File System (NFS) storage location.
Key Amazon File Cache Use Cases and Benefits
Whichever data sources sit behind it, the core purpose of Amazon File Cache remains the same: to make it easier and faster to process data in the most suitable location. It can be used within the AWS ecosystem to provide temporary, high-performance storage for data that resides on-premises or within AWS. This lets cloud architects design solutions that are more flexible, carry lower operational overhead, and deliver a better overall user experience thanks to high data access speed.
Amazon File Cache solves several technical problems that architects face when designing solutions in industries such as media, finance, energy, health, and manufacturing. Organizations in these industries often store large datasets spread across different locations and environments, and need to process them quickly and efficiently.
If you’re an AWS customer, introducing Amazon File Cache to your architecture brings several benefits and fulfills many use cases:
Media compute-intensive workloads
Compute-intensive media workloads, such as video and audio rendering and transcoding, require access to media files on fast, high-performance storage. Media files in Amazon S3 buckets or on-premises locations can be made available through Amazon File Cache to these temporary and costly compute workloads running on AWS Batch or Amazon EC2 instances. The workload completes faster, which reduces costs and removes the need to maintain duplicate copies of files across storage media.
Machine learning algorithm training
Training machine learning (ML) models requires fast access to datasets that often reside on-premises or are spread across multiple Amazon S3 buckets. This data needs to be made available to the training instances, usually in Amazon SageMaker, to maximize their computational throughput and the velocity of the MLOps pipeline. Amazon File Cache reduces the data engineering effort otherwise needed to make the datasets available for training, such as copying files, developing custom logic, and maintaining temporary storage volumes.
Cloud bursting for advanced data analytics
This is a popular use case for organizations that store large datasets on-premises and want to leverage the cloud for business intelligence and analytics. Organizations with petabytes of on-premises data can use Amazon File Cache to automatically cache only the subsets needed to run data transformations and advanced analytics with cloud services such as Amazon EMR (Elastic MapReduce) and AWS Glue.
How Does Amazon File Cache Work?
Amazon File Cache is available across the main AWS Regions. Within a few minutes, customers can create a cache and link it to their preferred data source. A cache can be linked to one or more on-premises Network File System (NFS) environments or AWS-native storage services such as Amazon FSx for NetApp ONTAP, Amazon FSx for OpenZFS, or Amazon S3 buckets.
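As a rough illustration of linking a cache to an NFS data source, the sketch below assembles the request parameters in plain Python without calling AWS. The parameter names and fixed values (such as LUSTRE, CACHE_1, and the 1200/2400 GiB sizing rule) follow our reading of the Amazon FSx CreateFileCache API and should be checked against the current AWS documentation. The helper name build_file_cache_request, the subnet ID, the filer hostname, and the IP address are hypothetical.

```python
from typing import Any, Dict, List


def build_file_cache_request(
    subnet_id: str,
    storage_capacity_gib: int,
    nfs_export_url: str,
    dns_ips: List[str],
    cache_path: str = "/ns1",
) -> Dict[str, Any]:
    """Assemble keyword arguments for an FSx CreateFileCache call (sketch)."""
    # Assumed sizing rule: 1200 GiB, or a positive multiple of 2400 GiB.
    valid = storage_capacity_gib == 1200 or (
        storage_capacity_gib > 0 and storage_capacity_gib % 2400 == 0
    )
    if not valid:
        raise ValueError("capacity must be 1200 GiB or a multiple of 2400 GiB")

    return {
        "FileCacheType": "LUSTRE",
        "FileCacheTypeVersion": "2.12",
        "StorageCapacity": storage_capacity_gib,
        "SubnetIds": [subnet_id],
        "LustreConfiguration": {
            "DeploymentType": "CACHE_1",
            "PerUnitStorageThroughput": 1000,
            "MetadataConfiguration": {"StorageCapacity": 2400},
        },
        # One association per linked data source; this one is an NFS export.
        "DataRepositoryAssociations": [
            {
                "FileCachePath": cache_path,
                "DataRepositoryPath": nfs_export_url,
                "NFS": {"Version": "NFS3", "DnsIps": dns_ips},
            }
        ],
    }


# Hypothetical values for illustration only.
request = build_file_cache_request(
    subnet_id="subnet-0123456789abcdef0",
    storage_capacity_gib=1200,
    nfs_export_url="nfs://filer.example.com/exports/media",
    dns_ips=["10.0.0.5"],
)
print(request["DataRepositoryAssociations"][0]["FileCachePath"])  # /ns1
```

With AWS credentials configured, a dictionary like this could be passed to boto3 as boto3.client("fsx").create_file_cache(**request). The AWS console and CLI offer the same operation.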
As an object store, Amazon S3 provides neither native caching nor a native way to use it as a file system. While there are alternative methods to enable this, the performance drawbacks are significant. With Amazon File Cache, customers can use data in Amazon S3 from their regular application workloads without having to change the workflow.
Amazon File Cache loads data from on-premises or cloud storage services into the cache automatically the first time the workload accesses it. This removes the need to implement custom methods to duplicate data, or to plan in advance exactly which files will be needed in the cloud environment. With the cache acting as a file gateway, data movement happens out of the box and on demand, as required by the cloud workload.
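The load-on-first-access behavior described above is the general read-through caching pattern. The toy sketch below illustrates that pattern only; it is not how Amazon File Cache is implemented, and the class name ReadThroughCache, the in-memory origin dictionary, and the file path are all hypothetical.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy read-through cache: go to the origin only on a cache miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_reads = 0  # counts how often the slow origin was hit

    def read(self, path: str) -> bytes:
        if path not in self._store:
            # First access: pull the file from the origin and keep a copy.
            self._store[path] = self._fetch(path)
            self.origin_reads += 1
        # Later accesses are served from the local copy.
        return self._store[path]


# Stand-in for an on-premises NFS share or an S3 bucket.
origin = {"/media/clip.mov": b"frame-data"}
cache = ReadThroughCache(lambda path: origin[path])

cache.read("/media/clip.mov")  # miss: loaded from the origin
cache.read("/media/clip.mov")  # hit: served from the cache
print(cache.origin_reads)  # 1
```

The workload only ever calls read; it does not need to know in advance which files to copy, which is the point made in the paragraph above.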
As a fully managed service, Amazon File Cache pricing follows a pay-as-you-go model. This means customers are billed based on the amount of storage capacity their cache uses (measured in GB per month) and the amount of data transferred within AWS Regions, Amazon services, or out of AWS.
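To make the two billing dimensions concrete, here is a small back-of-the-envelope estimate. The rates used are placeholders chosen for easy arithmetic, not AWS prices, and the function name estimate_monthly_cost is hypothetical. Actual rates vary by Region and should be taken from the AWS pricing page.

```python
def estimate_monthly_cost(
    storage_gb: float,
    storage_rate_per_gb_month: float,
    transfer_gb: float,
    transfer_rate_per_gb: float,
) -> float:
    """Storage charge plus data transfer charge for one month."""
    storage_cost = storage_gb * storage_rate_per_gb_month
    transfer_cost = transfer_gb * transfer_rate_per_gb
    return storage_cost + transfer_cost


# Placeholder inputs: a 2,400 GB cache and 500 GB of billable transfer.
cost = estimate_monthly_cost(
    storage_gb=2400,
    storage_rate_per_gb_month=1.00,  # placeholder rate, not an AWS price
    transfer_gb=500,
    transfer_rate_per_gb=0.10,  # placeholder rate, not an AWS price
)
print(f"{cost:.2f}")  # 2450.00
```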
With Amazon File Cache, it's easier to access multiple data sources from a single, unified access point. Amazon File Cache offers sub-millisecond latencies, up to millions of operations per second, and throughput of up to hundreds of GB/s. This significantly shortens workload completion times and lets you optimize the utilization of compute resources.
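The effect of throughput on completion time can be estimated as dataset size divided by sustained throughput. The sketch below compares two hypothetical rates for a hypothetical 10 TB dataset. It ignores latency, the initial load of data into the cache, and any compute bottleneck, so treat the numbers as an upper bound on the benefit rather than a benchmark.

```python
def read_time_seconds(dataset_gb: float, throughput_gb_per_s: float) -> float:
    """Ideal time to stream a dataset once at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return dataset_gb / throughput_gb_per_s


dataset_gb = 10_000  # hypothetical 10 TB dataset
for label, rate in [("1 GB/s link", 1.0), ("100 GB/s cache", 100.0)]:
    print(f"{label}: {read_time_seconds(dataset_gb, rate):.0f} s")
# 1 GB/s link: 10000 s
# 100 GB/s cache: 100 s
```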
Cloud Volumes Edge Cache Caching Solution
The Cloud Volumes Edge Cache (CVEC) solution combines NetApp’s leading Cloud Volumes ONTAP storage management system with NetApp’s premier Global File Cache service. It works natively with AWS services and also across other environments, such as on-premises infrastructure, Microsoft Azure, and Google Cloud.
Enterprise customers understand the importance of keeping control over their organization's data while ensuring its availability and protection. Managing distributed storage can be a headache, but NetApp CVEC makes it easier for ITOps to consolidate and centralize data in the public cloud while getting the scalability and performance of an enterprise-grade storage solution.
With Cloud Volumes Edge Cache, you have a central point of control that makes it easier to manage, plan, take action, and avoid disruptions caused by distributed storage. With built-in enterprise-grade data protection and out-of-the-box managed capabilities, it keeps data available to you and your end users in the most cost-efficient and operationally efficient manner.
Keeping your data safe is essential, and NetApp CVEC makes it hassle-free by providing unlimited backup to the cloud for all customers. You can avoid the headache of managing backups and worrying about data breaches or system failures, and rest assured that your data is protected and under control.
Summary and Key Takeaways
File caching is an effective way to improve the performance of applications that rely on cloud storage, and Amazon File Cache can be used to accelerate access to frequently used data. By using Amazon File Cache within the AWS ecosystem, organizations can improve application performance, reduce data transfer costs, and ensure that services such as Amazon EMR, AWS Glue, and Amazon SageMaker, among many others, have fast and efficient access to data.
Organizations that require an enterprise-grade service capable of fulfilling more advanced and complex use cases, such as integrating with multiple cloud providers, should consider trying NetApp Cloud Volumes Edge Cache.
NetApp Cloud Volumes Edge Cache provides a scalable, efficient, and cost-effective file caching solution for consolidating and centralizing data in the public cloud while ensuring data protection, efficient collaboration, and business continuity. It’s an excellent fit for storage administrators and AWS architects who need to manage distributed storage efficiently.
You can learn more about NetApp Cloud Volumes Edge Cache here.
What is AWS file cache?
The AWS ecosystem offers different caching capabilities and methods across its built-in services. Amazon File Cache is a dedicated, built-in service that provides file caching by connecting to existing on-premises or AWS storage services such as S3 and FSx for NetApp ONTAP.
Does S3 cache files?
Amazon Simple Storage Service (S3) doesn’t have native caching capabilities. However, caching for S3 data can now be achieved by combining S3 with the Amazon File Cache service.