July 23, 2019
Exponential data growth is one of the major challenges faced by every organization today. Capacity management of on-premises storage becomes a bottleneck, as purchasing and hosting additional storage devices every year is not a scalable solution for their storage needs.
Because of this, organizations often wind up compromising in one of two ways: sacrificing performance by selecting low-cost storage systems, or paying for costly storage arrays even though they’ll store large amounts of infrequently accessed, cold data. A third option is to manually migrate the cold data to cheaper storage, but that brings heavy administrative overhead and complex application reconfiguration.
The Cloud Tiering feature available to NetApp AFF storage system users helps address these concerns by allowing customers to tier inactive data to low-cost Amazon S3 object storage in AWS. There is no need to refactor or reconfigure applications to access the data, which lets you avoid high administrative overhead and disruption to your processes. It also helps customers optimize their AFF investments by reducing TCO by up to 40%. This blog post explores the cost and performance considerations of tiering data from on-premises AFF storage to Amazon S3.
Cloud Tiering to Amazon S3
NetApp AFF systems are high-performance, all-flash storage arrays powered by NetApp ONTAP® data management software. With Cloud Tiering, ONTAP’s industry-leading data services now extend from the data center to hybrid cloud architectures. Cloud Tiering uses NetApp FabricPool technology to tier infrequently accessed data to object storage in the cloud, such as Amazon S3. This lets customers reserve the high-performance on-premises storage array for frequently used data, balancing cost and performance where it matters most.
There are two policies available for cloud tiering:
- Tier cold data: Data that has remained inactive for 30 days, or for a user-defined period, is moved to cloud storage based on the policy configured. This policy is recommended if you expect your data to become stale after a while but still need to retain it for future reference.
- Tier snapshots: Though snapshots are an efficient approach to backup and disaster recovery, storing them in high-performance all-flash storage can hurt storage economy. The snapshot tiering policy addresses this concern by tiering snapshots to cheaper Amazon S3 object storage after a short cooling period of two days.
Some of the benefits of tiering data to object storage in the cloud are as follows:
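The two policies differ mainly in which data they consider and how long that data must stay inactive. The sketch below models that rule in Python. It is an illustration only: the policy labels, the `Block` type, and the `eligible_for_tiering` function are hypothetical names, not part of ONTAP or Cloud Tiering, and the assumption that the snapshot policy leaves active file system data alone is ours.

```python
from dataclasses import dataclass

# Cooling periods taken from the text: 30 days by default for cold data,
# 2 days for snapshots. The policy labels themselves are hypothetical.
DEFAULT_COOLING_DAYS = {"cold_data": 30, "snapshots": 2}


@dataclass
class Block:
    days_inactive: int   # days since the block was last accessed
    snapshot_only: bool  # True if only snapshots reference the block


def eligible_for_tiering(block, policy, cooling_days=None):
    """Return True if the block may move to the cloud tier under `policy`."""
    if policy not in DEFAULT_COOLING_DAYS:
        raise ValueError(f"unknown policy: {policy}")
    # A user-defined cooling period overrides the default, as the text allows.
    threshold = DEFAULT_COOLING_DAYS[policy] if cooling_days is None else cooling_days
    if policy == "snapshots" and not block.snapshot_only:
        # Assumption: the snapshot policy never moves active file system data.
        return False
    return block.days_inactive >= threshold
```

For example, `eligible_for_tiering(Block(days_inactive=45, snapshot_only=False), "cold_data")` returns `True`, while the same block under `"snapshots"` returns `False`.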
- Easy entry point to cloud for organizations exploring the hybrid cloud architecture.
- No application layer changes required: tiering happens transparently at the storage layer.
- Customers can maintain existing workflows and processes while leveraging the familiar ONTAP tools and technology.
- Enables best use of your AFF storage capacity, as it can be repurposed for hot data automatically.
Cost Benefits of Amazon S3 for AFF Users
Amazon Simple Storage Service, or Amazon S3, is one of the most preferred object storage services available in the public cloud today, with a distinct competitive edge when it comes to availability, security, and performance. Though it also serves other use cases, such as hosting websites and mobile applications, it is primarily intended as a low-cost data storage option and is often chosen for backups and archival data. NetApp AFF supports the Amazon S3 Standard and Standard-Infrequent Access storage classes, and customers choose one of them while configuring cloud tiering in NetApp AFF. If the Standard-Infrequent Access storage class is used, the inactive data is initially moved to the Standard storage class and then tiered to Standard-Infrequent Access after 30 days.
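In S3 terms, a move from Standard to Standard-Infrequent Access after 30 days is commonly expressed as a bucket lifecycle rule. The Python sketch below builds such a rule as a plain dictionary in the shape the S3 lifecycle API accepts. Whether Cloud Tiering sets up the transition this way is an assumption, and the helper name and rule ID are hypothetical.

```python
def standard_ia_lifecycle(days=30, rule_id="tier-to-standard-ia"):
    """Build an S3 lifecycle configuration that moves objects to Standard-IA."""
    if days < 30:
        # S3 does not transition objects to Standard-IA before 30 days.
        raise ValueError("Standard-IA transitions need at least 30 days")
    return {
        "Rules": [
            {
                "ID": rule_id,  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                "Transitions": [{"Days": days, "StorageClass": "STANDARD_IA"}],
            }
        ]
    }


config = standard_ia_lifecycle()
# With boto3 installed and credentials configured, this dictionary could be
# passed as LifecycleConfiguration to put_bucket_lifecycle_configuration.
```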
In many organizations there can be terabytes of inactive data in the storage system that could benefit from the economy of Amazon S3 object storage. Consider a scenario where 10 TB of data in your storage system is inactive or accessed very rarely. With a RAID 6 configuration and disks of 3.8 TB capacity each, you would need a minimum of five disks to store this data: three for the data and two for parity. As the share of infrequently accessed data grows over the years, which it inevitably will, you will have to add more disks to your storage array or even purchase new arrays to meet the capacity demands. This results in large upfront investments in storage hardware every year.
If the same 10 TB of data is tiered to the Amazon S3 Standard-Infrequent Access storage class, the annual cost would come to only $1,665, even if we assume that 10% of the data will be accessed and eventually tiered back to the on-premises AFF tier. The cost is incurred on a pay-as-you-go basis, with monthly charges as low as $139, thereby eliminating the need for large upfront investments.
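The five-disk figure follows from simple arithmetic: enough disks to hold the data, plus the two disks' worth of capacity that RAID 6 reserves for parity. Here is a minimal Python sketch of that calculation. It deliberately ignores spares, formatting overhead, and right-sizing, so treat it as a lower bound rather than a sizing tool; the function name is ours.

```python
import math

RAID6_PARITY_DISKS = 2  # RAID 6 reserves two disks' worth of capacity for parity


def raid6_min_disks(data_tb, disk_tb):
    """Minimum disk count for `data_tb` of data on RAID 6 with `disk_tb` disks."""
    data_disks = math.ceil(data_tb / disk_tb)  # whole disks needed for the data
    return data_disks + RAID6_PARITY_DISKS


print(raid6_min_disks(10, 3.8))  # 3 data disks + 2 parity disks = 5
```

Doubling the cold data to 20 TB raises the minimum to eight disks under the same assumptions, which is the growth pattern tiering is meant to avoid.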
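The monthly figure is the annual figure spread over twelve months, as the short Python sketch below shows. It also includes a generic pay-as-you-go estimator for running your own numbers. The estimator's rate parameters are placeholders that you must supply from current AWS pricing; they are not taken from this post, and the function does not claim to reproduce the $1,665 calculation.

```python
ANNUAL_COST_USD = 1665  # figure quoted in the text for 10 TB in Standard-IA

monthly = ANNUAL_COST_USD / 12
print(round(monthly, 2))  # 138.75, which the text rounds to $139


def estimate_annual_cost(stored_gb, storage_rate, retrieved_fraction, retrieval_rate):
    """Yearly cost: 12 months of storage plus one retrieval of part of the data.

    storage_rate:   USD per GB per month (placeholder, supply your own)
    retrieval_rate: USD per GB read back (placeholder; include any request
                    and data transfer fees you expect to pay)
    """
    storage = stored_gb * storage_rate * 12
    retrieval = stored_gb * retrieved_fraction * retrieval_rate
    return storage + retrieval
```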
Amazon S3 Performance Considerations
NetApp AFF systems offer best-in-class performance through an all-flash architecture that combines front-end NVMe/FC host connectivity with back-end NVMe-attached SSDs to achieve latencies as low as 100 µs. The NetApp FabricPool technology used in NetApp AFF helps maintain that performance while reducing overall storage costs by automatically tiering infrequently accessed data to Amazon S3 storage classes.
FabricPool allows you to attach one Amazon S3 bucket per aggregate. While you can attach the same bucket to multiple aggregates, doing so could degrade the performance of the capacity tier, because IOPS bottlenecks are possible when several aggregates access it at once. Another factor that could affect performance is the capacity of the performance tier. Plan the performance tier with enough headroom to hold cold data until it is tiered to the cloud, and to hold data that is tiered back from the capacity tier when it is accessed. If performance tier utilization exceeds 70%, data will not be written back, and the blocks will be read directly from the capacity tier in the cloud. Users would then see degraded performance, because data access speed would depend on external factors such as network latency.
Cloud Tiering goes a long way toward addressing organizations' data growth concerns while keeping costs under control. It is available in BYOL (bring your own license) and PAYGO (pay as you go) licensing models. With BYOL, customers pay for a specific storage capacity for a set time period, while with PAYGO, customers pay only for the data being tiered. Cloud Tiering also supports the “Lift and Shift” strategy by offering a zero-effort data center extension to the cloud. It could help you save 20X space with 79% AFF capacity savings.
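The 70% rule above amounts to a simple decision on each read of a tiered block. The Python sketch below models it under our own assumptions: the function name and return labels are hypothetical, and real FabricPool behavior depends on more than a single utilization number.

```python
WRITE_BACK_LIMIT = 0.70  # performance-tier utilization limit from the text


def read_path(used_tb, capacity_tb):
    """Decide how a read of a tiered block is served, per the 70% rule."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    utilization = used_tb / capacity_tb
    if utilization > WRITE_BACK_LIMIT:
        # Too full: serve the read from S3 and leave the block in the cloud.
        return "read_from_cloud"
    # Enough headroom: copy the block back to the performance tier.
    return "write_back"
```

For example, `read_path(8, 10)` returns `"read_from_cloud"`, while `read_path(5, 10)` returns `"write_back"`.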