Disasters are a major concern for any enterprise IT deployment. If something goes wrong, whether it’s a natural disaster or a careless mistake that takes down your system, you need a way to recover quickly and without data loss. Amazon FSx for NetApp ONTAP can help you address this challenge.
In this article, we’ll take a close look at how you can create a disaster recovery (DR) copy by using FSx for ONTAP, whether you’re trying to protect an on-premises NetApp® ONTAP® system, an existing FSx for ONTAP file system, or another on-prem or cloud-native storage system.
Read on as we cover:
- The Enterprise DR Challenge
- The Benefits of Using FSx for ONTAP for DR
- Case Study: How a Healthcare Service Provider Leverages FSx for ONTAP for Cloud-Based DR
- Beyond Disaster Recovery: Other Benefits of FSx for ONTAP
- It’s Your Data. Here’s How You Keep It Safe.
The Enterprise DR Challenge
Given the complexity, size, and criticality of the systems involved, implementing a DR solution needs to be done with care and attention. Let’s look at some of the common challenges faced by enterprises while implementing DR:
- Managing RPO and RTO
When designing DR strategies, there are two important numbers to define. Your system’s recovery time objective (RTO) is the maximum acceptable time for bringing services back online before the outage causes major repercussions. Your system’s recovery point objective (RPO) is the maximum acceptable data loss in an outage, usually expressed as the age of the most recent recoverable copy. Both numbers should be kept to a minimum.
- Data replication
For primary data hosted in either on-premises or cloud environments, moving data to the DR copy requires an efficient replication solution.
That solution needs to integrate with the source environment and have tools to make sure that transfers between the source and the target are efficient and nondisruptive, with no data loss. Data transfer costs are also a factor to keep in mind here.
- Operational overhead
As an environment gets more complex, so does the DR solution that protects it. DR solutions that work at the enterprise level often involve configuring replication, failover, failback, and periodic testing.
If that weren’t enough to worry about, there are other factors to consider, such as network connectivity, application consistency, and storage efficiency management. It’s also extremely important for the DR process to avoid affecting the production workload’s performance.
The DR solution should be simple and easy to manage in order to avoid operational overhead, loss of data, and other complications.
- Failover and failback
If a primary site fails, the DR solution should be able to fail over an application to the DR site. When the primary site is back online, the solution also needs to be able to synchronize any changed data before initiating a failback.
If failover and failback aren’t managed properly, this can be a disaster in itself. You could wind up with data inconsistencies or corruption, especially in environments with a high data change rate.
Note that failover and failback should also complete within your defined RPO and RTO and preserve storage efficiency, so your costs don’t spike during the changeovers.
- Recovering data—of all kinds—quickly
Disasters come in all shapes and sizes. Your DR tools should be able to protect from failures at any scale.
Sometimes, that could mean recovering files, folders, or even entire data volumes. Coarse-grained replication solutions often require recovering entire machines, which can lengthen recovery times.
To be effective, a good DR solution will be able to provide point-in-time, granular recovery of data to quickly bring your application back online.
- Disasters can take out entire regions
Catastrophic disasters happen, too. Your workloads deployed in the AWS Cloud should be designed to survive outages that could take down an entire AWS region.
Implementing a cross-region DR solution involves coordinating services and resources across regions. It’s better to use native services than third-party solutions simply because one more element in the mix presents another point of failure.
- A high-availability DR environment
Your primary system runs business-critical enterprise applications, so highly available storage is key to maintaining access for your end users and preserving business continuity.
In case of a disaster and failover, your DR environment also needs to rely on highly available storage.
- Keeping DR costs under control
Your DR copy is going to double the amount of cloud storage space consumed and add those costs to your overall IT budget. Plus, that data will (hopefully) never be used.
You also need to consider the costs of data transfers when replicating the data to the DR environment. Such costs will be even higher if you’re trying to replicate across regions.
To save costs, we recommend that you reduce your storage footprint as much as possible by applying storage efficiencies and other optimization techniques and carry them over to the DR copy.
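To make the RPO and RTO definitions from the start of this list concrete, here is a minimal Python sketch that checks a replication schedule against a pair of objectives. It assumes scheduled asynchronous replication, where the worst case is a failure just before an update finishes: the newest copy at the DR site is then one full interval plus one transfer time old. The class name, function names, and sample figures are illustrative assumptions, not part of any NetApp or AWS API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrObjectives:
    """Hypothetical container for the two DR targets discussed above."""
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable time to restore service


def worst_case_data_loss(replication_interval: timedelta,
                         transfer_time: timedelta) -> timedelta:
    """Age of the newest usable DR copy if the source fails just before
    an in-flight update completes (assumed scheduled async replication)."""
    return replication_interval + transfer_time


def meets_objectives(objectives: DrObjectives,
                     replication_interval: timedelta,
                     transfer_time: timedelta,
                     measured_recovery_time: timedelta) -> dict:
    """Compare a replication schedule and a tested recovery time to the targets."""
    loss = worst_case_data_loss(replication_interval, transfer_time)
    return {
        "worst_case_loss": loss,
        "rpo_met": loss <= objectives.rpo,
        "rto_met": measured_recovery_time <= objectives.rto,
    }


if __name__ == "__main__":
    # Sample figures are made up for illustration only.
    targets = DrObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=1))
    report = meets_objectives(
        targets,
        replication_interval=timedelta(minutes=10),
        transfer_time=timedelta(minutes=3),
        measured_recovery_time=timedelta(minutes=20),
    )
    print(report)
```

A check like this is most useful when the recovery time comes from an actual DR test rather than an estimate, because the RTO result is only as good as the measurement behind it.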
These challenges might seem insurmountable when taken together, but with FSx for ONTAP you get a solution that combines all the enterprise-grade DR capabilities of NetApp ONTAP in an AWS service. Here’s how we do it.
The Benefits of Using FSx for ONTAP for DR
FSx for ONTAP can be a key part of your DR strategy—whether your primary system uses ONTAP on premises, FSx for ONTAP, or another non-NetApp system on premises or in the cloud.
No matter which way you choose to use the service, FSx for ONTAP incorporates several signature NetApp ONTAP storage management and DR capabilities:
- Keep copies constantly in sync
You can use NetApp SnapMirror® data replication capabilities to replicate data asynchronously to the DR site. The replication is block-level and incremental, which keeps the data in the DR site updated efficiently. It relies on NetApp Snapshot™ technology in the back end: a Snapshot copy is created instantly and consumes no additional capacity at creation, an initial baseline transfer copies the data to the DR site, and each later update transfers only the blocks that changed since the previous Snapshot copy.
The replication can be done between on-premises ONTAP and FSx for ONTAP, or between two FSx for ONTAP instances, and can synchronize according to schedules you define. For environments based in AWS, the replication can either be in-region or cross-region. If the source is non-NetApp, the data can either be migrated or simply replicated to an ONTAP-based system first, preferably FSx for ONTAP, after which the DR copy can be easily created.
Because this replication is a built-in FSx for ONTAP capability, you don't need to configure any additional replication servers to start using it. It’s ready to go as soon as you need it.
Snapshot technology, the secret behind SnapMirror, creates point-in-time, read-only copies of FSx for ONTAP data volumes. These copies are extremely lightweight and consume minimal storage. Using Snapshot copies, SnapMirror can replicate data between repositories faster.
- Seamless failover and failback
In a disaster, FSx for ONTAP seamlessly shifts operations from running on your primary dataset to your DR copy. When the situation is resolved, operations fail back to the primary location.
The data stored in the secondary location is kept up to date asynchronously, helping you meet your DR goals. The data can be recovered from the alternate location in less than 10 minutes, so your RTO stays low.
- Cross-region support
FSx for ONTAP offers fully managed cross-region DR capabilities out of the box. You can use this feature to replicate data to a secondary AWS region, so that the applications are protected from regional outages.
Because the cross-region DR is managed and scalable, setting it up is easy. And because there aren’t any additional hardware or software requirements, it’s also cost effective. With the SnapMirror replication feature, your applications can quickly fail over to the secondary region.
- Highly available
As with all FSx for ONTAP environments, the DR environment is highly available. So, when you’re forced to fail over to the DR environment in a disaster, FSx for ONTAP provides the same level of RPO=0, RTO<60-second business continuity that your primary system provides—or even better.
- Cost saving with storage efficiencies and tiering
ONTAP-based systems support native NetApp storage efficiency features, such as thin provisioning, data compression, deduplication, and compaction. Those efficiencies carry over from your primary environment to the FSx for ONTAP DR copy. That keeps your cloud storage costs optimized and limits the amount of data SnapMirror replicates to the copy, so your transfer costs stay low as well. And, as mentioned earlier, cross-region replication with FSx for ONTAP requires no additional hardware or software.
In addition to those storage efficiencies, FSx for ONTAP also supports data tiering. This feature automatically tiers infrequently accessed data to low-cost Amazon S3 storage according to the data’s usage pattern, and when the data is needed again, it’s automatically retrieved to the SSD-based performance tier. And for disaster recovery, you can tier the entire DR copy—which is, by nature, cold data—until the moment it’s needed.
FSx for ONTAP can also create instant, zero-capacity writable clones, which can help you avoid consuming space for DR test copies and development purposes.
These features combine to reduce your overall storage footprint and bring down your storage costs dramatically, both for the primary and the DR copy.
- Automated processes
Just like any other native AWS service, FSx for ONTAP can be managed with AWS CLIs and SDKs, as well as with the NetApp BlueXP™ GUI, API, or automation tools. These tools can help you automate different aspects of the deployment, such as your DR configuration and replications. You can also create scripts to trigger failovers if a disaster occurs.
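The incremental, block-level replication described under "Keep copies constantly in sync" can be illustrated with a small toy model. The sketch below is not how ONTAP or SnapMirror is implemented. It models a volume as a Python dictionary of block numbers, takes point-in-time snapshots, and ships only the blocks that differ between two snapshots. All names are hypothetical, and the snapshot is modelled as a plain copy purely for simplicity, whereas real Snapshot copies share blocks with the active volume.

```python
from typing import Dict, Optional

Volume = Dict[int, bytes]  # block number -> block contents


def take_snapshot(volume: Volume) -> Volume:
    """Freeze a point-in-time view of the volume (modelled here as a copy)."""
    return dict(volume)


def changed_blocks(previous: Volume, current: Volume) -> Dict[int, Optional[bytes]]:
    """Return only the blocks that differ between two snapshots.
    A value of None marks a block that was removed."""
    delta: Dict[int, Optional[bytes]] = {}
    for number, data in current.items():
        if previous.get(number) != data:
            delta[number] = data
    for number in previous.keys() - current.keys():
        delta[number] = None
    return delta


def apply_delta(destination: Volume, delta: Dict[int, Optional[bytes]]) -> None:
    """Bring the DR copy up to date by applying the changed blocks in place."""
    for number, data in delta.items():
        if data is None:
            destination.pop(number, None)
        else:
            destination[number] = data


if __name__ == "__main__":
    source: Volume = {0: b"alpha", 1: b"beta", 2: b"gamma"}

    # Baseline transfer: the first update copies everything.
    baseline = take_snapshot(source)
    dr_copy: Volume = dict(baseline)

    # The source changes between scheduled updates.
    source[1] = b"BETA"
    source[3] = b"delta"
    del source[0]

    # Incremental update: only the differences travel to the DR site.
    latest = take_snapshot(source)
    delta = changed_blocks(baseline, latest)
    apply_delta(dr_copy, delta)
    print(f"blocks transferred: {len(delta)}, in sync: {dr_copy == latest}")
```

The point of the model is the size of `delta`: after the baseline, each update moves an amount of data proportional to the change rate rather than to the size of the volume.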
Case Study: How a Healthcare Service Provider Leverages FSx for ONTAP for Cloud-Based DR
One company that chose FSx for ONTAP for DR is a provider of healthcare services and medical facility management. The company uses Epic as its Electronic Health Records (EHR) management solution and relies on FSx for ONTAP to protect this critical data.
The company stores its Epic EHR workloads on non-NetApp storage systems on-premises. Epic recommends its users follow the 3-2-1 data protection strategy: that means keeping three copies of the data across two formats, with at least one copy of the data stored off-site. The company decided to house this off-site copy cost effectively in the cloud.
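The 3-2-1 rule lends itself to a simple check. The following Python sketch assumes each copy of the data is described by a name, a storage format label, and an off-site flag. The `DataCopy` type, the labels, and the sample inventory are hypothetical and do not describe the company's actual environment.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataCopy:
    name: str
    storage_format: str  # hypothetical label, e.g. "on-prem-array" or "cloud-file-system"
    offsite: bool


def satisfies_3_2_1(copies: Iterable[DataCopy]) -> bool:
    """True when there are at least three copies, on at least two
    storage formats, with at least one copy held off-site."""
    copies = list(copies)
    enough_copies = len(copies) >= 3
    enough_formats = len({c.storage_format for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_formats and has_offsite


if __name__ == "__main__":
    # Illustrative inventory only.
    inventory = [
        DataCopy("production", "on-prem-array", offsite=False),
        DataCopy("local-backup", "on-prem-array", offsite=False),
        DataCopy("cloud-dr-copy", "cloud-file-system", offsite=True),
    ]
    print(satisfies_3_2_1(inventory))
```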
The main challenge was syncing the DR copy every 15 minutes to minimize data loss in the event of a disaster. Other solutions that were tested out were only capable of backing up the data once every hour. The company chose FSx for ONTAP because it was able to achieve the 15-minute sync period for the DR site.
Additional benefits the company gained with FSx for ONTAP:
- Reduced costs thanks to FSx for ONTAP storage efficiencies and tiering capabilities. These inherent features reduce the company’s ongoing expenses for the Epic DR site on AWS. Using data tiering, it’s possible to store the entire DR copy cost-effectively on Amazon S3 object storage until it’s needed.
- High-performance storage on FSx for ONTAP meets strict performance requirements for Epic workloads. This capability supports business continuity in the event of a disaster that requires the DR site to operate as the production environment.
- Non-disruptive cloning thanks to FlexClone® technology allows the company to perform validation tests on the DR site without disrupting operations. The same data clones can also be used for testing and developing Epic workloads in a cloud environment.
Beyond Disaster Recovery: Other Benefits of FSx for ONTAP
Amazon FSx for NetApp ONTAP offers additional benefits for storage management, cost optimization, and security. These benefits offer secondary advantages for your DR deployment and help you get better returns on your AWS storage investments.
Multi-AZ high availability
FSx for ONTAP can be deployed in a highly available configuration across multiple Availability Zones through its multi-AZ architecture. In this configuration, your FSx for ONTAP data is synchronously replicated to a different Availability Zone in the same region, keeping it protected from any outages that affect the primary Availability Zone.
Writable data clones
FSx for ONTAP supports NetApp FlexClone technology, which gives you the power to create instant, zero-capacity, point-in-time, writable clones of data volumes. These clones are extremely useful in stress-testing your DR system to make sure it can handle a failure.
Here’s how they work: A FlexClone data copy shares data blocks with its parent volume and uses incremental space for the writes. That means it consumes storage only for changes to the copy. Clones can be used for quick testing purposes, where you can create clones of application environments at minimal cost and without affecting the production systems.
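The block-sharing behavior described above can be sketched with a toy copy-on-write model. This is an illustration of the general idea only, not of FlexClone internals. The `Clone` class and its methods are hypothetical, and the parent image is frozen with a plain copy as a modelling shortcut, whereas a real clone shares the parent's blocks without copying them.

```python
from types import MappingProxyType
from typing import Dict, Mapping


class Clone:
    """Writable clone that reads unchanged blocks from a frozen parent
    image and stores only the blocks written to the clone itself."""

    def __init__(self, parent_blocks: Mapping[int, bytes]) -> None:
        # Read-only point-in-time view of the parent (copy is a shortcut).
        self._parent = MappingProxyType(dict(parent_blocks))
        self._own: Dict[int, bytes] = {}

    def read(self, number: int) -> bytes:
        # Prefer the clone's own version of a block, else fall back to the parent.
        if number in self._own:
            return self._own[number]
        return self._parent[number]

    def write(self, number: int, data: bytes) -> None:
        # Writes never touch the parent; they consume space in the clone only.
        self._own[number] = data

    def blocks_consumed(self) -> int:
        """Number of blocks the clone stores on its own."""
        return len(self._own)


if __name__ == "__main__":
    production = {0: b"config", 1: b"records", 2: b"index"}
    test_clone = Clone(production)
    print(test_clone.blocks_consumed())   # nothing written yet

    test_clone.write(1, b"records-modified-by-dr-test")
    print(test_clone.read(1), production[1], test_clone.blocks_consumed())
```

In the model, a DR test can rewrite any block of the clone while the production data stays untouched, and the clone's footprint grows only with the number of blocks the test changes.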
Security and compliance
Before FSx for ONTAP writes data to the underlying storage, the data is encrypted using keys managed through the AWS Key Management Service (KMS). For data in transit, you can use Kerberos-based encryption by joining your system to Active Directory.
FSx for ONTAP also supports all the native AWS-based security features, such as network isolation, access control, logging, and auditing. But the same goes for NetApp: You can leverage trusted NetApp security features such as NetApp FPolicy® / Vscan for auditing, file blocking, and antivirus scanning.
FSx for ONTAP is covered by several compliance programs and security standards, including PCI DSS, SOC 1, SOC 2, SOC 3, HIPAA, and ISO 9001/27001/27017, to name just a few. Depending on your industry vertical, this can make the compliance process much simpler.
FSx for ONTAP also supports immutable storage (WORM protection) by using NetApp SnapLock® compliance software, so that data remains static and can’t be tampered with, modified, or deleted for a specified retention period.
To learn more about the FSx for ONTAP data protection and security features, visit our NetApp cyber resilience for AWS page.
It’s Your Data. Here’s How You Keep It Safe.
Disasters are going to happen. What you do to keep your data safe is up to you. FSx for ONTAP can help meet your DR goals and maintain business continuity.
FSx for ONTAP is designed to keep your data protected on AWS. Its trusted NetApp capabilities, such as SnapMirror replication and automatic failover and failback, kick in when you need them most. The solution provides an easier way of achieving your DR requirements, without compromising on storage efficiency or cost.
Whether you’re using an on-premises NetApp ONTAP system or Amazon FSx for NetApp ONTAP natively, DR with FSx for ONTAP will keep you up and running.