Cloud storage snapshots, such as AWS snapshots, are point-in-time copies of cloud volumes. Keeping consistent copies of your volumes ensures that if the data on a volume is corrupted, infected, or accidentally deleted, it can be restored quickly from an up-to-date copy. However, keeping these copies current can become a cost concern, because copies must be created frequently. Learning how to manage snapshot costs is a crucial step in AWS cost optimization.
In this article, we will look at some of the built-in snapshot technologies offered by AWS and Azure and show you how NetApp’s Cloud Volumes ONTAP offers an alternative way to cost-effectively create, store, and maintain cloud snapshot data.
Taking snapshots is an easy way to protect your system from data loss, and Azure and AWS both provide snapshot functionality to do so. Below we’ll look at examples of the AWS and Azure snapshot technologies and at how snapshots are priced on each service.
One of the most common snapshot uses in Azure is creating snapshots of page blobs or block blobs. Azure blob snapshots have the same name as the base blob and use a DateTime attribute to distinguish the snapshot capture date. Azure keeps track of unique blocks (in the case of block blobs) or pages (in the case of page blobs) and charges are incremental based on those unique blocks or pages.
How Azure snapshots work: Here’s an example of how a page blob and its snapshots would work on Azure. In this example, the base blob initially has three pages: Page 1, Page 2, and Page 3. A snapshot is created on Day 1. The number of unique pages in this scenario is three (all the original pages in the blob).
Let’s say that on Day 3, there is a modification to Page 2 of the base blob, creating Page 2’. Now there are four unique pages: the base blob’s Page 1, Page 2’, and Page 3, plus the original Page 2, which is preserved in the snapshot.
Now, say on Day 14 the other two pages in the base blob are modified, creating Page 1’ and Page 3’, and three new pages are added: Page 4, Page 5, and Page 6. In total there are now nine unique pages: the three original pages held in the snapshot, the three changed pages (Page 1’, Page 2’, and Page 3’), and the three all-new pages. The user will be charged for these nine unique pages.
Here is a simple formula for determining the storage costs associated with a storage account hosting page blobs and periodic snapshots:
Total Storage Size (incl. snapshots) in GB = (Total number of unique pages x Size of each page (512 B)) / (1024 * 1024 * 1024)
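To make the page-counting example and the formula above concrete, here is a minimal Python sketch. It models each blob as a dictionary of page number to version label, which is purely an illustrative assumption and not how Azure stores blobs. The function names are hypothetical.

```python
PAGE_SIZE_BYTES = 512  # page size used in the formula above


def count_unique_pages(base_blob, snapshots):
    """Count distinct (page number, version) pairs across a blob and its snapshots."""
    unique = set(base_blob.items())
    for snapshot in snapshots:
        unique.update(snapshot.items())
    return len(unique)


def total_storage_gb(unique_pages, page_size=PAGE_SIZE_BYTES):
    """Apply the formula: unique pages x page size, converted from bytes to GB."""
    return unique_pages * page_size / (1024 ** 3)


# Day 1: the snapshot captures the three original pages.
snapshot_day1 = {1: "original", 2: "original", 3: "original"}
base_day1 = dict(snapshot_day1)

# Day 3: Page 2 of the base blob is modified.
base_day3 = {1: "original", 2: "modified", 3: "original"}

# Day 14: Pages 1 and 3 are modified and Pages 4, 5, and 6 are added.
base_day14 = {1: "modified", 2: "modified", 3: "modified",
              4: "original", 5: "original", 6: "original"}

print(count_unique_pages(base_day1, [snapshot_day1]))   # 3
print(count_unique_pages(base_day3, [snapshot_day1]))   # 4
print(count_unique_pages(base_day14, [snapshot_day1]))  # 9
```

With real blobs the page count is far larger; for instance, 2,097,152 unique pages of 512 bytes each come to exactly 1 GB.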
The price of storage can then be determined based on calculations via the Azure pricing calculator. Azure snapshots can also be copied to other blobs, but only blobs which do not have any existing snapshots in them.
The GetPageRanges API can also help copy incremental snapshots in Azure storage. The GetPageRanges API is especially handy in the case of using premium Azure blob storage, because it can reduce the high costs of storing snapshots on Premium disks. Though the GetPageRanges API can also be applied to Standard storage disks, in this section we’ll look at how it works in Premium storage.
The snapshots can be copied over to a destination blob in a standard account and incremental changes in the snapshots can be copied thereafter. This drastically reduces snapshot charges and acts as an effective backup mechanism where the snapshots can be restored over the base blobs.
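The incremental copy idea can be sketched in a few lines of Python. This is an in-memory model only: it does not call Azure, and `diff_snapshots` and `apply_diff` are hypothetical helpers standing in for the step where the changed page ranges between two snapshots are retrieved and then written to the destination blob.

```python
def diff_snapshots(previous, current):
    """Return the pages that changed or were added, and the pages that were cleared."""
    updated = {page: data for page, data in current.items()
               if previous.get(page) != data}
    cleared = [page for page in previous if page not in current]
    return updated, cleared


def apply_diff(destination, updated, cleared):
    """Bring the destination copy up to date using only the differences."""
    for page in cleared:
        destination.pop(page, None)
    destination.update(updated)


# The first snapshot is copied in full to the destination (the backup).
snapshot_1 = {0: b"A", 1: b"B", 2: b"C"}
backup = dict(snapshot_1)

# In the second snapshot, page 1 changed, page 2 was cleared, page 3 is new.
snapshot_2 = {0: b"A", 1: b"B2", 3: b"D"}

updated, cleared = diff_snapshots(snapshot_1, snapshot_2)
apply_diff(backup, updated, cleared)

print(len(updated), "pages transferred instead of", len(snapshot_2))
print(backup == snapshot_2)
```

Only the differing pages cross over to the destination, which is why the ongoing cost tracks the rate of change rather than the full size of the disk.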
If the snapshots are copied to a different storage account, the size of the snapshots can be determined by the following formula (also applies to Standard storage):
Total Snapshot Size = Initial Disk Storage Size + (incremental changes) * retention period
Let’s look at an example of how it would work. Company ABC has a DS5 v2 Windows VM hosting their in-house data analytics application. This resource-intensive VM has a Premium unmanaged data disk of size P50 (4 TB) that hosts the data for the application, and the base page blob starts off with 2 TB of data. The average daily incremental change of pages is 0.04 TB, with a retention period of 30 days. A snapshot is taken every day. The initial snapshot is copied to another unmanaged Standard disk using Snapshot Blob. Incremental changes are applied using the GetPageRanges API and PutPage.
Total Snapshot Size = 2 TB + (0.04 TB * 30) = 3.2 TB
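The same arithmetic can be written as a small Python helper, which makes it easy to plug in your own figures. The function name and parameters are illustrative assumptions; the values are the ones from the Company ABC example.

```python
def total_snapshot_size_tb(initial_tb, daily_change_tb, retention_days):
    """Initial full copy plus the daily incremental change over the retention period."""
    return initial_tb + daily_change_tb * retention_days


size_tb = total_snapshot_size_tb(initial_tb=2, daily_change_tb=0.04, retention_days=30)
print(f"Total snapshot size: {size_tb:.1f} TB")  # Total snapshot size: 3.2 TB
```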
This would amount to a total charge of roughly $148 USD, based on the pricing calculator for the East US Region. Please note this does not take storage transactions into account and the disks (both Standard and Premium) considered here are of LRS redundancy.
At the time of writing, incremental snapshots of managed disks are not supported, so every snapshot of a managed disk is a full copy of the source disk, including any data that has changed since the previous snapshot. Note that when snapshots of managed disks are taken, billing is based on the total occupied space, not the provisioned size.
Amazon EBS snapshots are created on an incremental basis. When a new snapshot is created, it only copies the block changes from the last snapshot and users are charged according to those changes. For example, if the base storage size was 500 GB and only 3 GB of data has been modified since the last snapshot, then a new snapshot will only consume 3 GB and you are charged for a 3 GB snapshot.
AWS EBS snapshot pricing: The three main parameters that define your AWS EBS snapshot costs for storage are baseline storage size, data growth rate, and the retention period of the snapshots. Here is a simple formula for calculating snapshot storage costs for an Amazon EBS volume hosting a database based on the above parameters:
Total snapshot size = Disk storage size (one-time full copy/migration) + (incremental changes in the DB x retention period)
Company XYZ has an Amazon EBS volume that hosts SQL Databases and Amazon EC2 instances for their LOB application, with an initial data size of 2 TB. As it is an important volume, the RPO is set at 12 hours, which dictates that snapshots of the volume are created twice a day. Let’s imagine that the volume has an average daily incremental change of 0.04 TB with a defined retention period of 30 days. So in this case:
Total Snapshot Size = 2 TB + (0.04 TB * 30) = 3.2 TB
If the snapshots are stored in Amazon S3 in the US East (N. Virginia) AWS Region, the price amounts to $75.25 USD. This can easily grow into thousands of dollars as the size of the volume increases, or when there are multiple volumes or longer retention periods.
Note: The prices are based on calculations made via the AWS Calculator and are accurate for the above-mentioned region at the time of writing of this article.
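To see how quickly the total grows with more volumes or longer retention, the formula can be run over a few scenarios. The sketch below is illustrative only: `ASSUMED_PRICE_PER_GB` is a hypothetical placeholder, not a quoted AWS rate, and every volume is assumed to match the Company XYZ profile. Substitute the current rate from the AWS Calculator for your region.

```python
def snapshot_size_tb(initial_tb, daily_change_tb, retention_days):
    """Total snapshot size in TB, following the formula in the text."""
    return initial_tb + daily_change_tb * retention_days


def storage_cost_usd(size_tb, price_per_gb):
    """Storage cost for a given size, treating 1 TB as 1024 GB."""
    return size_tb * 1024 * price_per_gb


# Hypothetical placeholder rate, NOT a real AWS price.
ASSUMED_PRICE_PER_GB = 0.05

for volumes in (1, 10, 50):
    for retention_days in (30, 90):
        # Assumes every volume has 2 TB of data and 0.04 TB of daily change.
        size = volumes * snapshot_size_tb(2, 0.04, retention_days)
        cost = storage_cost_usd(size, ASSUMED_PRICE_PER_GB)
        print(f"{volumes:>2} volume(s), {retention_days} days retention: "
              f"{size:6.1f} TB, ${cost:,.2f}")
```

Because the cost is linear in both the number of volumes and the retention period, doubling either one roughly doubles the incremental part of the bill.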
Here are some important things to remember about Amazon EBS snapshots:
With all of the functionality of AWS snapshots and Azure snapshots, there is still the matter of the costs involved. As an alternative, Cloud Volumes ONTAP users can create application-aware snapshots that have no performance impact and consume minimal storage space at no additional cost. Cloud Volumes ONTAP snapshots are based on NetApp Snapshots® technology. They are created in a matter of seconds (typically less than one second), irrespective of the size of the volume that is being copied. This is because, instead of copying all the data in your system, they capture only the data that has changed, by manipulating block pointers. Cloud Volumes ONTAP snapshots are thus extremely efficient in terms of both space and creation speed, more so than any other kind of public cloud snapshot.
In addition to the benefits of Cloud Volumes ONTAP’s snapshots, the Cloud Volumes ONTAP storage efficiency features of thin provisioning, data deduplication, and data compression ensure that you can store your data as economically as possible. Data tiering creates additional savings by allowing infrequently accessed data such as snapshots to be kept in inexpensive Amazon S3 storage, while still being accessible for use in Amazon EBS should restoration be necessary. Together these benefits typically save a company as much as 50% to 70% on public cloud storage costs.