In the IT world, you’ll often need extra copies of a dataset—for example, when doing application development and testing (DevTest) and when provisioning new virtual machines. However, creating those copies is easier said than done.
There are some major challenges associated with data cloning, especially when you’re dealing with large datasets. Creating copies of the relevant data can be time consuming and can lead to storage sprawl, increased costs, slower performance, and process overhead. All of that will increase your overall time to market, which will affect your bottom line.
Your development can’t wait that long. Amazon FSx for NetApp ONTAP offers built-in data management capabilities delivered as a first-party AWS service, and it can help overcome these challenges.
Read on for more, or jump down using these links:
- Why copying your dataset matters (and what makes it hard to do)
- There’s a more efficient way to clone data with FSx for ONTAP
- How a major games-as-a-service provider accelerates development using FSx for ONTAP
- The bottom line: Faster development, lower costs
Why copying your dataset matters (and what makes it hard to do)
Everyone knows that data is one of the most important assets that an organization can have. But how that data gets used is what makes the difference. Given the importance of data, it’s not something you want to tamper with. To make proper use of your data, you need a “golden copy”—an identical version of your dataset that serves as a testbed environment that you can re-create repeatedly. A golden copy keeps the primary dataset safe from your tests, and you can put the copy through testing without affecting production.
The two biggest areas where such copies come into play are in the development pipeline and in creating new environments.
When it comes to DevTest, an important metric is how many tests you can run against a code base per hour. The more tests you can run, the faster the code base progresses. Some tests require hundreds of runs, involving hundreds of copies.
Data copies are also used extensively in disaster recovery (DR) environment testing, which involves using copies of data to restore application services outside your primary data location. Other popular use cases for data copies include database refreshing, exploratory data analysis, high-performance computing for media and entertainment (M&E), analytics, and AI.
However, creating copies for these purposes can be challenging for a number of reasons.
- Copying data takes time. To get a version of the data that you can safely test against, you need to copy the golden copy, and a traditional full copy is slow. Depending on the size of your dataset, creating the copies you need can take up most of the test run time. That limits the number of tests per hour, which in turn delays your release.
- Rapidly increased storage usage and costs. Because data copies fully duplicate the original dataset, each copy adds the full size of the dataset to your storage consumption and adds to your compute and network resource usage. The DevTest process can require many such copies (sometimes hundreds), ballooning your costs. Plus, your developers and administrators will sink a lot of valuable time and energy into handling these copies.
- Delayed time to market. Pushing new releases is the way apps stay agile and competitive. You can’t do that if your release schedule is getting bogged down by overly long and complex copy mechanisms.
- Performance issues. Accessing and updating data copies in multiuser or multiapplication contexts might result in resource contention, leading to performance issues.
- Operational overhead. Managing multiple data copies can be complicated and error prone. It requires meticulous planning to ensure that the clones are consistent and up to date, which adds to the operational overhead.
The challenges to working with data copies are considerable, but NetApp and AWS have partnered to deliver a solution for writable thin-clone copies: Amazon FSx for NetApp ONTAP.
There’s a more efficient way to clone data with FSx for ONTAP
Amazon FSx for NetApp ONTAP has a built-in data cloning capability that’s delivered by NetApp® FlexClone® technology. This capability lets you create instantaneous point-in-time local copies of your data volumes—copies that are writable and consume minimal storage space.
These “thin” clones make it much faster and less expensive to build your test environments, refresh your databases, and much more.
How FSx for ONTAP cloning works
FSx for ONTAP uses FlexClone technology to create highly space-efficient, writable copies. Here’s how it works:
- You can instantly create local writable copies of volumes, LUNs, and files. The instantaneous data volume copies created by FSx for ONTAP leverage a virtual layer on top of an existing NetApp Snapshot™ copy. That Snapshot copy acts as a golden copy and requires very little metadata. The clone copies are created independently from the master copy, making the cloning process extremely space efficient.
- Clones are updated independently of parent volumes. A clone initially shares all of its blocks with its parent, and additional storage space is consumed only when data changes. Changes are recorded in 4KB block increments.
The clones thus have no performance impact on applications that use the production data volumes. If required, you can also split the clones from their master copy and use them independently—but this would require additional disk space.
When you clone DR volumes that correspond to your production environment, for testing or other purposes, the SnapMirror® feature works continuously to replicate data to the clones’ parent volumes while your DevTest team works on the clones.
- Clones are space efficient, which lowers costs. Consider the example of DevTest for a 100GB production database. Normally, that requires a full mirror and then many full copies for developers and testers to use. If we assume three copies for developers and three for testers, the total storage required is 800GB, including the production database itself.
Even if a full mirror copy of the data is maintained to avoid affecting production storage, using FlexClone for DevTest copies reduces the storage consumption to 260GB. That reduces the overall amount of storage required by about 67% and cuts costs proportionally. Learn more about how to determine the space used by a FlexClone volume.
- Clones have low performance overhead. Because creating a clone consumes almost no additional storage, you can refresh clones with updated production data as often as you need to. That means you’re always able to test against current data, rather than stale data.
Clones also allow you to carry out testing without affecting the production environment. When the testing is finished, you simply delete the clone and create a new, clean clone image in a matter of seconds.
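The storage figures in the 100GB example above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it reads "three of each type" as three developer copies plus three tester copies, and the 10GB of changed data per clone is inferred from the 260GB total rather than stated in the text.

```python
# Storage arithmetic for the 100GB DevTest example.
# CHANGED_GB_PER_CLONE is an inferred assumption, not a figure from the article.

PRODUCTION_GB = 100
MIRROR_GB = 100             # one full mirror of production
COPIES = 3 + 3              # three developer copies plus three tester copies
CHANGED_GB_PER_CLONE = 10   # assumed: about 10% of blocks diverge in each clone


def full_copy_total(production_gb, mirror_gb, copies):
    """Total storage when every DevTest copy is a full duplicate."""
    return production_gb + mirror_gb + copies * production_gb


def thin_clone_total(production_gb, mirror_gb, copies, changed_gb_per_clone):
    """Total storage when DevTest copies are thin clones storing only changed blocks."""
    return production_gb + mirror_gb + copies * changed_gb_per_clone


full = full_copy_total(PRODUCTION_GB, MIRROR_GB, COPIES)
thin = thin_clone_total(PRODUCTION_GB, MIRROR_GB, COPIES, CHANGED_GB_PER_CLONE)
savings_pct = 100 * (full - thin) / full

print(f"full copies: {full}GB, thin clones: {thin}GB, saved: {savings_pct:.1f}%")
```

Under these assumptions the totals come out to 800GB and 260GB, a reduction of 67.5%. Actual savings depend on how much data each clone changes.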
You can also use APIs to automate the cloning process and integrate it with your CI/CD (continuous integration and continuous deployment) pipeline. This approach avoids the DevTest cloning challenges discussed earlier.
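As a rough illustration of what such automation might look like, the sketch below builds (but does not send) a request to the ONTAP REST API to create a FlexClone volume. The endpoint path and the `clone` fields follow the ONTAP REST API as we understand it and should be verified against the API reference for your ONTAP version. The hostname, SVM name, volume names, Snapshot name, and password are all hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_flexclone_request(mgmt_endpoint, svm, parent_volume, clone_name,
                            username, password, parent_snapshot=None):
    """Build (but do not send) a POST request asking ONTAP to create a FlexClone volume."""
    clone_spec = {"is_flexclone": True, "parent_volume": {"name": parent_volume}}
    if parent_snapshot is not None:
        # Base the clone on a specific Snapshot copy of the parent volume.
        clone_spec["parent_snapshot"] = {"name": parent_snapshot}
    payload = {"name": clone_name, "svm": {"name": svm}, "clone": clone_spec}

    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{mgmt_endpoint}/api/storage/volumes",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_flexclone_request(
    mgmt_endpoint="management.fs-example.fsx.us-east-1.amazonaws.com",  # hypothetical
    svm="svm_devtest",              # hypothetical SVM name
    parent_volume="prod_db",        # hypothetical parent volume
    clone_name="prod_db_ci_1234",   # hypothetical clone name
    username="fsxadmin",
    password="example-only",        # placeholder; load real credentials from a secret store
    parent_snapshot="golden",       # hypothetical Snapshot copy name
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

A CI/CD job could pass the resulting request to `urllib.request.urlopen` at the start of a build and issue a matching delete call when the build finishes. Keeping request construction separate from sending makes the payload easy to inspect and unit test.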
Using the FlexClone capabilities of FSx for ONTAP results in significant improvements in development and test capabilities. If you’re able to test faster, you’re releasing builds faster.
How development pipelines benefit from data cloning with FSx for ONTAP
Let’s take a look at some of the things that you can achieve with FSx for ONTAP data cloning.
Clone copies play crucial roles in app development and DR.
- Faster time to market with instantly created development environments. With the FlexClone capability, copies of production environments are created instantaneously. Developers who use FlexClone spend less time waiting for copies and more time working, because clones are created and cleaned up quickly. This in turn leads to more agility, better productivity from the development team, and faster time to market.
- Cost savings. Because thin clones consume minimal storage space, they don’t incur much extra cost in AWS.
- Quick environment refresh. Because FSx for ONTAP creates data clones instantaneously, you can refresh the DevTest environment with the production environment data whenever it’s required. That refresh speed lets you test more frequently—and with the most up-to-date data.
- Zero-impact testing. FlexClone allows you to carry out tests without jeopardizing your production environment or your primary dataset. When testing is done, you can simply remove the clone and produce a new clone in seconds. That capability reduces overhead and speeds up the development process.
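The create, test, and discard cycle described in the list above can be sketched as a small context manager. This is a minimal illustration, not a NetApp or AWS API: `create_clone` and `delete_clone` are hypothetical callables that a pipeline would supply, and the in-memory stand-ins below exist only so the sketch runs without a storage system.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_clone(create_clone, delete_clone, parent_volume, clone_name):
    """Create a clone for one test run and always delete it afterward.

    create_clone and delete_clone are hypothetical callables supplied by the caller.
    """
    create_clone(parent_volume, clone_name)
    try:
        yield clone_name
    finally:
        # Runs even if the tests raise, so failed runs do not leave clones behind.
        delete_clone(clone_name)


# In-memory stand-ins for a real storage backend.
existing_clones = set()


def fake_create(parent_volume, clone_name):
    existing_clones.add(clone_name)


def fake_delete(clone_name):
    existing_clones.discard(clone_name)


results = []
for run_id in range(3):
    name = f"prod_db_test_{run_id}"
    with ephemeral_clone(fake_create, fake_delete, "prod_db", name) as clone:
        # A real pipeline would mount the clone and run its test suite here.
        results.append((clone, clone in existing_clones))

print(results)
print(existing_clones)
```

Each test run gets a fresh clone, and the `finally` clause removes it whether or not the tests pass, which keeps the production volume untouched and avoids accumulating stale clones.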
How a major games-as-a-service provider accelerates development using FSx for ONTAP cloning
This game developer and games-as-a-service provider puts out some of today’s most popular gaming titles, with hundreds of millions of players worldwide connected through in-house networks. Migrating to FSx for ONTAP has had a major effect on how this company makes that all happen.
The game company was looking to speed up the development cycle in their build-farm operations in AWS. Because the game product is live, it requires constant short releases. AWS offered access to more compute power and scalability, doubling the number of daily builds achieved. With FSx for ONTAP as the storage layer, the company was able to do even more:
- Reduced the transfer time for source code to new instances from hours to minutes. Previously, work on the code had to stop while data copies were created, and that slowed down the entire CI/CD process. With FSx for ONTAP thin cloning, new copies could be created instantaneously and then easily shared.
- Cut down storage costs for the massive code-base testing. There are hundreds of instances running parallel tests in development. FlexClone technology creates zero-capacity-cost data clones instead of copying entire volumes of the data for each test copy and storing them at full cost. The resulting savings are significant.
- Eliminated downtime potential. With the multiple Availability Zone (multi-AZ) high availability built into FSx for ONTAP, the data resides on two nodes that are kept in sync across two separate AZs. Even if an outage occurs in one AZ, the build process can continue without interruption, because the developers can still access the data stored on the FSx for ONTAP node in the unaffected AZ.
The bottom line: Faster development, lower costs
The demands of the development cycle on the storage layer can lead to high costs and schedule delays. With Amazon FSx for NetApp ONTAP thin cloning, you not only get instant, performance-neutral clones—you also avoid paying for extra storage capacity as you create copies.
Don’t let your data slow you down. Let the thin-clone capability of FSx for ONTAP save you time and money.