
Cloud Tiering with NetApp StorageGRID

September 22, 2020

Topics: Cloud Tiering, Advanced | 6 minute read

Your high-performance NetApp AFF or SSD-based FAS is running out of capacity, and a chunk of the used capacity is cold data, which could be recovered by tiering your storage to the public cloud.

But what if company policy or regulatory requirements won't allow your data to go off-prem? Or what if you have large data files that would be costly to access and migrate back?

In that case, a private cloud storage solution such as NetApp StorageGRID is a better fit. StorageGRID keeps your data in your own data center, which helps with security, and it imposes no IOPS limits or charges for accessing your data.

Now, you can tier your performant storage to NetApp StorageGRID and enjoy the benefits of Cloud Tiering on a private cloud.

What Is StorageGRID?

StorageGRID is a software-defined, clustered object storage system that can be deployed on engineered systems, commodity hardware, or virtual machines. It can reside in a single site or span multiple sites to provide geo-redundancy, which enhances data durability and availability.

StorageGRID provides multi-tenancy and natively supports the standard Amazon S3 API for object storage access. As an S3 object storage solution, StorageGRID serves as a massive repository for unstructured data across many use cases, such as enterprise backup and archive, enterprise content management, media asset management, HPC, analytics, and IoT.


A StorageGRID cluster consists of several nodes, with a minimum of one admin node, and three storage nodes. A single StorageGRID cluster can scale up and out to fulfill growth requirements by adding more capacity to existing storage nodes or by adding more storage nodes. Multiple admin nodes and gateway nodes can be added as well, to facilitate resilience and load balancing needs.

Altogether there are three node types that can exist in a cluster:

  • Admin Node: Management and Monitoring of the storage cluster. Can be used to load balance S3 client traffic.
  • Storage Node: Object and metadata storage.
  • Gateway Node (optional): Load balances client connections to the optimal storage nodes at the optimal site, making storage node failures and even site failures transparent. With traffic classification policies, network traffic can be identified for metrics reporting or for QoS, where traffic can be limited by bucket, tenant, IP subnet, or load balancer port.
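
As a rough illustration of the minimum cluster composition described above (one admin node and three storage nodes, with gateway nodes optional), the following Python sketch checks a proposed node list. The function name, the lowercase type labels, and the message wording are assumptions made for this example; they are not part of any StorageGRID tool or API.

```python
from collections import Counter

# Minimums taken from the text: one admin node and three storage nodes.
# Gateway nodes are optional, so their minimum is zero.
MIN_NODES = {"admin": 1, "storage": 3, "gateway": 0}


def check_cluster(node_types):
    """Return a list of problems with a proposed node list (empty if none)."""
    counts = Counter(node_types)
    problems = []
    # Flag any label that is not one of the three node types.
    for unknown in sorted(set(counts) - set(MIN_NODES)):
        problems.append(f"unknown node type: {unknown}")
    # Compare the count of each known type against its minimum.
    for node_type, minimum in MIN_NODES.items():
        if counts[node_type] < minimum:
            problems.append(
                f"need at least {minimum} {node_type} node(s), got {counts[node_type]}"
            )
    return problems


small = ["admin", "storage", "storage"]
print(check_cluster(small))

# Scaling out: add a storage node and an optional gateway node.
grown = small + ["storage", "gateway"]
print(check_cluster(grown))
```

The first call reports the missing storage node, and the second returns an empty list once the cluster has been scaled out.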

The StorageGRID family of hardware appliances combines the StorageGRID node computing and storage elements in a single, efficient, integrated solution. In addition, installing and configuring a StorageGRID appliance-based node is simplified and automated through an installation user interface, removing the need for manual scripting and execution of files. Clusters can be a hybrid of virtual and physical hosts, depending on requirements.

The StorageGRID SG1000 services appliance node can operate as an Admin or Gateway node. Its 40 CPU cores and four 10, 25, or 100 Gbit Ethernet ports make it ideal for administrative workloads and as a high-throughput load balancer.

The storage node types have two performance options:

  • The SG5700 appliance comes with a compute controller that has 8 CPU cores and 10 or 25 Gbit Ethernet ports. The compute controller is attached to the storage controller in either a 2U chassis with 12 NL-SAS drives (the SG5712) or a 4U chassis with 60 NL-SAS drives (the SG5760).
  • The SG6000 appliance comes with a 1U compute node that has 40 CPU cores and four 10 or 25 Gbit Ethernet ports. The compute node is attached to either a 2U chassis with 24 SSDs and redundant storage controllers (the SGF6024) or a 4U chassis with 58 NL-SAS drives, 2 SSDs, and redundant storage controllers (the SG6060). The SG6060 can take expansion shelves, added before or after installation, for up to 178 NL-SAS drives.

The following figure summarizes the main hardware attributes of the StorageGRID appliances:

[Figure: SG Types]

For software-based nodes, there are two choices. One is prebuilt VMs with StorageGRID installed, supplied as OVA files for import into VMware. The other is Docker containers on Linux, provided as rpm or deb packages that contain documentation, template configuration files, and the Docker images. Supported Linux distributions include:

  • Red Hat Enterprise Linux
  • CentOS
  • Debian
  • Ubuntu

Tiering to the GRID: The Benefits of Cloud Tiering

A primary concern for any on-prem deployment is optimizing the use of physical devices and limiting the number of new disk shelves and nodes that must be procured when the system needs to scale up and out. With the Cloud Tiering service, NetApp has added a powerful way for NetApp AFF and SSD-backed FAS users to address those issues.

Cloud Tiering automatically detects infrequently used data on the on-prem ONTAP system and can seamlessly tier that data to public cloud object storage or locally to StorageGRID. This makes it possible to reclaim space on high-performance devices and extend their available capacity with a highly durable, cheaper storage tier.
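
To make the idea of detecting infrequently used data concrete, here is a minimal Python sketch that selects blocks whose last access is older than a cooling period. It is only an illustration of the general technique, not Cloud Tiering's actual algorithm; the function name, the block IDs, and the 30-day cooling period are all hypothetical.

```python
from datetime import datetime, timedelta


def select_cold_blocks(last_access, now, cooling_days):
    """Return IDs of blocks not accessed for at least `cooling_days` days.

    `last_access` maps a block ID to the datetime of its last read.
    The cooling period is a hypothetical parameter for this sketch.
    """
    cutoff = now - timedelta(days=cooling_days)
    return sorted(block for block, seen in last_access.items() if seen <= cutoff)


now = datetime(2020, 9, 22)
last_access = {
    "blk-1": now - timedelta(days=2),   # recently read: stays on the performance tier
    "blk-2": now - timedelta(days=45),  # cold: candidate for tiering
    "blk-3": now - timedelta(days=31),  # cold: candidate for tiering
}
print(select_cold_blocks(last_access, now, cooling_days=30))
# prints ['blk-2', 'blk-3']
```

A shorter cooling period reclaims more space on the performance tier, at the cost of more data being read back from the object store when it turns hot again.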

Let’s take a look at some of the other advantages of tiering to StorageGRID.

Security and Privacy

Because data is tiered from an on-premises high-performance NetApp appliance to a NetApp StorageGRID system that is also on-prem, your data never leaves your infrastructure. To ensure privacy and security, NetApp recommends using ONTAP NVE/NAE to encrypt data at rest, and TLS with certificates from an in-house Certificate Authority to protect data in flight.

Performance

There are no limits on IOPS with StorageGRID, and performance scales with the number of nodes in the system, meaning tiered data moves as fast as your infrastructure allows. When tiered data is accessed, it becomes hot again and migrates back to the performance tier. The latency perceived at the client can be considerably lower than with public clouds.

Efficiency

Although StorageGRID has configurable compression for stored objects, it’s not necessary in this case. Data blocks tiered from NetApp AFF or SSD-backed FAS come from volumes that are typically already compressed, deduplicated, and compacted. These efficiencies are preserved when data is tiered, so only a minimal StorageGRID footprint is consumed.

Data Protection

To protect tiered data from failures, Cloud Tiering supports the StorageGRID Information Lifecycle Management (ILM) policies for data replication and erasure coding. ILM policies can dictate where data is stored and how it is protected. StorageGRID uses two-copy replication as the default ILM rule for data protection. Erasure coding is another data protection scheme used in StorageGRID deployments and is the recommended best practice for achieving cost-efficient data protection. Erasure coding is specified using the N+M format, where the data is split into N blocks and M parity blocks are calculated. Under erasure coding, data is still recoverable if up to M chunks are lost.

For best practices on StorageGRID ILM settings with Cloud Tiering, read section 6.1 of the FabricPool Best Practices Technical Report.
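
The cost-efficiency argument for erasure coding comes down to simple arithmetic: an N+M scheme stores N+M fragments for every N data fragments, so it consumes (N+M)/N units of raw capacity per unit of data. The Python sketch below compares that with two-copy replication. The 4+2 and 6+3 schemes are illustrative picks for this example, not recommendations from the source, and the function names are hypothetical.

```python
from fractions import Fraction


def replication(copies):
    """Raw capacity per byte stored, and how many copies can be lost."""
    return Fraction(copies), copies - 1


def erasure_coding(n, m):
    """Raw capacity per byte stored, and how many fragments can be lost."""
    return Fraction(n + m, n), m


schemes = {
    "2-copy replication": replication(2),
    "4+2 erasure coding": erasure_coding(4, 2),  # illustrative scheme
    "6+3 erasure coding": erasure_coding(6, 3),  # illustrative scheme
}
for label, (overhead, losses) in schemes.items():
    print(f"{label}: {float(overhead):.2f}x raw capacity, survives {losses} loss(es)")
# prints:
# 2-copy replication: 2.00x raw capacity, survives 1 loss(es)
# 4+2 erasure coding: 1.50x raw capacity, survives 2 loss(es)
# 6+3 erasure coding: 1.50x raw capacity, survives 3 loss(es)
```

In this comparison, both erasure coding schemes use less raw capacity than two copies while tolerating more simultaneous losses.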

Conclusion

Data center storage is still in high demand due to its superior performance, but not all of your data needs to be stored on a performant system, and some data is not allowed to leave the premises. In this post, we introduced how StorageGRID clusters and features can be used to free up NetApp AFF and SSD-backed FAS capacity with the NetApp Cloud Tiering service.

Cloud Tiering to StorageGRID also helps keep data more secure as the data always stays on-prem, and more performant as your data only moves between on-prem infrastructure and networks in your data center.


Oded Berman, Cloud Evangelist
