Amazon EFS (Elastic File System) provides cloud-based NFS file share services that are highly available and scalable. It is very quick to get started, allowing you to create a new filesystem share in minutes. Each filesystem can be accessed by hundreds or thousands of client machines and applications concurrently. This enables a wide range of uses, from consolidating data from multiple sources for data analysis to creating content management systems.
On AWS, migrating data to and from Amazon EFS can still pose challenges. Although Amazon EFS File Sync can be used to copy files into Amazon EFS from Amazon EC2 and on-premises systems, data cannot be copied out in the reverse direction. For bi-directional transfer from on-premises systems, an AWS Direct Connect connection is required in order to mount the filesystem locally, as VPN connections are not supported.
Cloud Sync has been designed to address these data transfer and data synchronization challenges. As part of NetApp’s Cloud Data Services, Cloud Sync makes moving data between any source and destination a simple task while keeping it synchronized in a robust and secure manner.
With Cloud Sync, you can ingest data into Amazon EFS from any NFS share, whether hosted on NetApp systems or not, as well as work with other types of storage, such as CIFS shares and Amazon S3. Cloud Sync also facilitates data synchronization in the reverse direction out of Amazon EFS back to NFS systems.
In this article we will start by describing how Cloud Sync works and the features it provides for migrating data to and from Amazon EFS, and then explore how Amazon EFS and Cloud Sync can be used together to protect Amazon EFS filesystems.
Cloud Sync is a SaaS (Software-as-a-Service) solution for data migration between any source and destination platform. After performing an initial baseline copy of the full data set, Cloud Sync will incrementally synchronize only the data that has changed, which makes it very efficient, especially when working with large datasets.
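The baseline-then-incremental pattern described above can be illustrated with a minimal sketch. This is not Cloud Sync's implementation; it is a hypothetical `sync_tree` function that assumes both source and target are locally mounted directories (for example, NFS mounts) and that a change in file size or a newer modification time is enough to detect a changed file. It also ignores deletions on the source.

```python
import os
import shutil


def _needs_copy(src, dst):
    """Return True if dst is missing or looks older/different than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Assumption: size or a newer whole-second mtime signals a change.
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def sync_tree(source, target):
    """Copy new or changed files from source to target.

    The first run copies everything (the baseline); later runs copy
    only files that changed. Returns the sorted relative paths copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel_dir))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            if _needs_copy(src, dst):
                # copy2 preserves the modification time, so an unchanged
                # file is not copied again on the next run.
                shutil.copy2(src, dst)
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

Calling `sync_tree` twice in a row on an unchanged source copies every file on the first call and nothing on the second, which is why incremental runs are cheap for large datasets.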
As Cloud Sync is a service solution, there are no software or agent installations to perform and users can start migrating data within minutes after signing up. A simple and fully-featured web-based UI guides you through the process of setting up synchronization relationships, and from the dashboard you can view the current status of all relationships, check audit logs, and perform other administration functions. DevOps users also have the option to integrate Cloud Sync with a wider workflow by making use of its RESTful API.
The Cloud Sync service works by performing data migration and synchronization operations through the use of a data broker instance. This instance can be created in the cloud, using Amazon EC2 on AWS, or using an on-premises or Microsoft Azure virtual machine. In each case, the Cloud Sync UI simplifies the process by helping you create the data broker.
Selecting a source and target in Cloud Sync.
Using Cloud Sync has many benefits:
While creating and using Amazon EFS filesystems is very quick and easy to do, backing up existing shares that have been around for a while is a little more involved. The approach documented by AWS involves the deployment of an AWS Data Pipeline that uses a workflow provided by Amazon through GitHub.
This solution effectively copies the live Amazon EFS data to a secondary, backup Amazon EFS filesystem. It allows you to maintain a rolling set of backups and an additional workflow template is provided to restore the data back to the production environment.
Amazon EFS File Sync allows for data to be migrated into Amazon EFS from NFS file shares hosted on-premises or in Amazon EC2. Amazon EFS File Sync does not support copying data back out of Amazon EFS, and therefore cannot be used as a backup solution. Amazon EFS File Sync uses an intermediary agent to manage synchronization operations, which provides a text-based console interface for configuration and setup.
Cloud Sync has the advantage of being a service offering and therefore requires no manual setup. The modern, web-based user interface provides users with easy access to all functionality, including the setup of the data broker instances used to facilitate data migrations. As a general solution to data migration and synchronization, Cloud Sync supports a broader scope: it can work with other storage systems, such as CIFS shares and object stores, and can perform cross-region replication, for example.
After the initial full backup, Cloud Sync is able to keep the data synchronized incrementally by copying over only the changes made since the last sync. Not only does this make the synchronization process much faster and far more efficient, it is also very useful when the destination system supports point-in-time snapshot copies. By creating such a snapshot each time Cloud Sync finishes synchronizing, you automatically get a rolling set of backups.
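To make the idea of a rolling set of backups concrete, here is a small sketch of the bookkeeping involved. It is an illustration only: the `rotate_snapshots` function, the `sync-` naming scheme, and the retention count are all assumptions, and actually creating or deleting snapshots would be done with whatever tooling the destination storage system provides.

```python
from datetime import datetime


def rotate_snapshots(existing, now, keep=7):
    """Plan a rolling snapshot set after a sync completes.

    existing: names of snapshots already on the destination
    now:      datetime at which the sync finished
    keep:     how many of the most recent snapshots to retain

    Returns (new_name, kept, expired). Names embed a sortable
    timestamp, so lexical order matches chronological order.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Hypothetical naming scheme, chosen so names sort by time.
    new_name = now.strftime("sync-%Y%m%dT%H%M%S")
    all_names = sorted(set(existing) | {new_name})
    kept = all_names[-keep:]
    expired = all_names[:-keep]
    return new_name, kept, expired
```

A scheduler could call this after every completed sync, create the snapshot named by `new_name`, and delete those listed in `expired`.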
As we’ve seen in the course of this article, Cloud Sync makes data synchronization easier, faster, and more robust than homegrown solutions. When working with Amazon EFS in particular, Cloud Sync can be a lifesaver when migrating to and from Amazon EFS and when a backup solution is required, providing added flexibility and ease of use.
If you’re ready to take the next step in data synchronization with NetApp, Cloud Sync is available now as a free 14-day trial through the AWS Marketplace or on NetApp Cloud Central.