Cloud Tiering for On-Prem Kubernetes Deployments

June 15, 2020

Using Kubernetes to manage your applications and data is considered the standard today. Together with NetApp Trident, you can dynamically create and manage persistent volumes on NetApp All Flash FAS (AFF) appliances and meet the sophisticated persistence demands of your containerized applications in Kubernetes. But if you're storing cold data on Kubernetes persistent volumes, you may not be using your high-performance storage optimally.

For example, Kubernetes volume snapshots taken through Trident create ONTAP volume snapshots. Although very efficient, these snapshots count as cold data and typically consume up to 10% of your storage space. How can you reduce this space usage?

The NetApp Cloud Tiering service allows persistent volumes to be tiered to public cloud object storage on AWS, Azure, or Google Cloud. By extending your on-prem systems to the cloud, it cuts costs and increases the available space on your performant AFF or SSD-backed FAS storage systems. This happens automatically and seamlessly, without having to rewrite your applications.

Deploying Kubernetes On-Prem

Kubernetes is the most popular state-based orchestration engine and runtime environment for containers. A Kubernetes cluster consists of at least one master node, which manages the cluster, and worker nodes, which provide the runtime environment for the containers.

The desired state describes the Kubernetes pods to deploy, the number of replicas for each pod, and the resources required. Once supplied with the desired state, the Kubernetes master reviews the available CPU, memory, and storage on each worker node and determines a deployment plan that can fulfill it. It then either deploys the desired state or fails and produces an error.
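To make the idea of matching a desired state against node capacity concrete, here is a deliberately simplified sketch. It uses a first-fit strategy, which is an assumption made for illustration and not the algorithm the Kubernetes scheduler actually uses. The names Node and plan_deployment are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A worker node with its currently free resources (hypothetical model)."""
    name: str
    cpu_free: float  # CPU cores
    mem_free: float  # GiB of memory


def plan_deployment(nodes, replicas, cpu_req, mem_req):
    """Place each replica on the first node with enough free CPU and memory.

    Returns the list of node names chosen, one per replica, or raises
    RuntimeError if the desired state cannot be fulfilled.
    """
    # Work on a copy so the caller's Node objects are not modified.
    free = {n.name: [n.cpu_free, n.mem_free] for n in nodes}
    placement = []
    for replica in range(replicas):
        for name, (cpu, mem) in free.items():
            if cpu >= cpu_req and mem >= mem_req:
                free[name][0] -= cpu_req
                free[name][1] -= mem_req
                placement.append(name)
                break
        else:
            # No node could host this replica: report an error, as the
            # master does when the desired state cannot be met.
            raise RuntimeError(f"replica {replica} is unschedulable")
    return placement
```

With two workers offering 2 and 4 free cores, a request for three replicas of one core each succeeds, while a request for seven replicas raises an error because the cluster has capacity for only six.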

During the lifecycle of a container, required storage is created locally on the worker node, and containers can access this storage. Since these containers and their data are both ephemeral, if a container crashes, its data is lost. When Kubernetes recreates the container, that container starts in a clean state, without any of the data that was stored by the crashed container. That can become a serious issue if some of that data was required for compliance purposes or when file sharing is required between containers running within the same pod.

To overcome the ephemeral nature of data, Kubernetes uses the concept of volumes. These volumes are assigned to each pod, allowing data to be shared among containers, and most importantly, to survive container restarts. However, volumes are reclaimed as soon as the pod ceases to exist.

For a volume to outlive its pod, the volume must be provisioned as what is known as a persistent volume, using a persistent volume plugin. Persistent volumes are separate from any pod’s lifecycle. Once a persistent volume is released, it has three possible outcomes: it is deleted, recycled, or retained until manually deleted. Which outcome applies depends on the reclaim policy configured for each persistent volume.
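The reclaim policy is a field on the persistent volume object itself. The sketch below builds a persistent volume manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The helper name persistent_volume and the NFS server, export path, and size in the example are hypothetical placeholders; the manifest field names follow the core Kubernetes v1 API.

```python
import json

# Reclaim policies defined by Kubernetes for persistent volumes.
RECLAIM_POLICIES = {"Retain", "Recycle", "Delete"}


def persistent_volume(name, size, nfs_server, nfs_path, reclaim_policy="Retain"):
    """Return a PersistentVolume manifest for an NFS export (illustrative only)."""
    if reclaim_policy not in RECLAIM_POLICIES:
        raise ValueError(f"unknown reclaim policy: {reclaim_policy}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            # Decides what happens to the volume once its claim is released.
            "persistentVolumeReclaimPolicy": reclaim_policy,
            "nfs": {"server": nfs_server, "path": nfs_path},
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own server and export path.
    manifest = persistent_volume("reports-pv", "10Gi", "nfs.example.internal", "/exports/reports")
    print(json.dumps(manifest, indent=2))
```

Retain is used as the default here so that nothing is deleted by accident; that is a choice made for this sketch, not a recommendation from the article.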

When installed on-premises, Kubernetes typically requires storage for application data volumes as well as local storage on each node for the Kubernetes worker installation, downloaded container images, and any local container storage volumes. With a single Kubernetes cluster supporting up to 300,000 containers across 5,000 nodes, storage requirements can increase quickly.

Persistent volumes are either statically provisioned, where each volume is manually created, or dynamically provisioned, where Kubernetes uses a plugin to create the volume. As you can imagine, manual provisioning of Kubernetes volumes every day for a growing environment may be impractical.

Using a persistent volume plugin to dynamically provision volumes is efficient. Still, if you have a NetApp storage appliance running ONTAP, the NetApp Trident plugin, in particular, offers many benefits over and above just the provisioning of volumes.

NetApp Trident for Dynamic Provisioning

The NetApp Trident plugin integrates directly with Kubernetes and can be used to automatically provision iSCSI and NFS persistent volumes on NetApp AFF and FAS systems running ONTAP. Different storage classes can be created where each class defines a different set of storage properties depending on the provisioner.

This is how Trident works. In Kubernetes, when a persistent volume claim is created and no existing persistent volumes can fulfill the request, Kubernetes will try to dynamically provision a volume. It does this by handing the claim over to the provisioner that was defined for the storage class in the claim. In this case, the provisioning task is handed over to Trident, which creates an NFS volume or an iSCSI LUN in the NetApp system and advertises it as a persistent volume in Kubernetes, so that it can be mounted into the pod.
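The flow above is driven by two Kubernetes objects: a storage class that names the provisioner, and a claim that names the storage class. The following sketch generates both as JSON manifests. The provisioner string csi.trident.netapp.io and the backendType parameter are what CSI-based Trident releases use, but treat them as assumptions and check the documentation for your Trident version. The helper names, object names, and sizes are hypothetical.

```python
import json


def storage_class(name, backend_type):
    """StorageClass that hands provisioning over to Trident (assumed CSI provisioner name)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "csi.trident.netapp.io",
        # For example "ontap-nas" for NFS volumes or "ontap-san" for iSCSI LUNs.
        "parameters": {"backendType": backend_type},
    }


def volume_claim(name, storage_class_name, size):
    """PersistentVolumeClaim that triggers dynamic provisioning through the class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    for manifest in (storage_class("ontap-gold", "ontap-nas"),
                     volume_claim("app-data", "ontap-gold", "20Gi")):
        print(json.dumps(manifest, indent=2))
```

When the claim is created and no existing persistent volume satisfies it, Kubernetes looks up the storage class named in storageClassName and passes the request to that class's provisioner.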

NetApp Trident supports the Container Storage Interface (CSI). This interface enables provisioners to communicate with multiple orchestrators, makes storage integration with Kubernetes more seamless, and allows storage vendors to expose features such as volume resize, volume snapshots, and volume cloning to Kubernetes through their provisioner.

Using Kubernetes interfaces and constructs, it’s possible to create on-demand volume snapshots of a persistent volume in use by a Kubernetes pod. When a snapshot is requested for a persistent volume provisioned by Trident, Trident communicates with ONTAP and uses the NetApp Snapshot™ functions. The resulting Snapshot copy of the NetApp volume can be used to create a new persistent volume and bind it to a pod, recover a pod from the point of the Snapshot copy, or create a clone of a pod for testing.
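As a rough illustration of those Kubernetes constructs, the sketch below builds a VolumeSnapshot request for an existing claim, and a second claim that is restored from that snapshot. The API version shown is the generally available snapshot.storage.k8s.io/v1; older clusters expose a beta version instead, so adjust it to your cluster. The helper names and all object names are hypothetical.

```python
import json


def volume_snapshot(name, pvc_name, snapshot_class):
    """Request an on-demand snapshot of the volume bound to an existing claim."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",  # may differ on older clusters
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def claim_from_snapshot(name, snapshot_name, storage_class_name, size):
    """Create a new claim whose volume is populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
            # Points the provisioner at the snapshot to restore from.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


if __name__ == "__main__":
    snap = volume_snapshot("app-data-snap", "app-data", "trident-snapclass")
    clone = claim_from_snapshot("app-data-test", "app-data-snap", "ontap-gold", "20Gi")
    print(json.dumps([snap, clone], indent=2))
```

The restored claim can then be mounted by a test pod, which leaves the original volume untouched.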

Protecting application data is a critical task, even if it’s a stateful containerized workload. However, Snapshot copies that protect application data stored locally in a persistent volume can’t protect against catastrophic failures of the storage array. Data replication plays an important role in protecting against data loss in such cases. Persistent volumes created by Trident can be replicated to another NetApp system, for disaster recovery, using NetApp MetroCluster or SnapMirror at both the SVM and volume level.

Dynamic provisioning of persistent volumes, volume snapshots, and the ability to easily replicate volumes are excellent features that allow a great deal of flexibility and save considerable administrative effort. Still, as with any storage in use, cold data accumulates over time in old snapshots and in secondary copies, which retain cold data by nature. This reduces your available storage capacity on both primary and secondary systems, and with it the amount of high-performance storage that Kubernetes can use. This is where Cloud Tiering comes in.

How Cloud Tiering Can Help

With Cloud Tiering, any data that is identified as cold moves to cloud object storage on Amazon S3, Azure Blob storage, or Google Cloud Storage, releasing the high-performance hot tier for application use. The requirement for CAPEX to purchase storage to increase hot tier capacity is replaced by OPEX costs for relatively cheap cloud storage.

In production, Cloud Tiering can safely and securely move cold data to cloud storage. Cold data could be snapshot copies of persistent volumes, or data not accessed for 30 days (extendable to 63 days). For example, volumes no longer being used that are still attached to pods, data that has been processed and ingested, or data that has aged could all easily be moved to the cloud where they can be stored cost-effectively.
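The cold-data rule described above can be sketched as a simple classification. This is only a model of the idea, not how ONTAP or Cloud Tiering is implemented: real tiering operates on storage blocks according to the tiering policy set on each volume. The DataSet type, the function names, and the sample sizes and dates are hypothetical, and snapshot copies are treated as cold immediately to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataSet:
    """A chunk of stored data with its size and last access time (hypothetical model)."""
    name: str
    size_gib: float
    last_access: datetime
    is_snapshot: bool = False


def is_cold(item, now, cooling_days=30):
    """Snapshot data is always cold; other data is cold once the cooling period has passed."""
    if item.is_snapshot:
        return True
    return now - item.last_access >= timedelta(days=cooling_days)


def tierable_gib(items, now, cooling_days=30):
    """Total capacity that could move to object storage under this simplified rule."""
    return sum(i.size_gib for i in items if is_cold(i, now, cooling_days))


if __name__ == "__main__":
    now = datetime(2020, 6, 15)
    items = [
        DataSet("orders-db", 500, datetime(2020, 6, 14)),            # accessed yesterday
        DataSet("ingested-logs", 200, datetime(2020, 5, 6)),         # idle for 40 days
        DataSet("orders-db-snap", 50, datetime(2020, 6, 14), True),  # snapshot copy
    ]
    print(tierable_gib(items, now))                   # 30-day cooling period
    print(tierable_gib(items, now, cooling_days=63))  # extended cooling period
```

Lengthening the cooling period keeps recently idle data on the hot tier for longer, at the cost of freeing less high-performance capacity.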

For volumes containing only backup copies, the entire volume’s data can be moved to cloud storage, reducing the capacity required for backups and even allowing more, or more frequent, backups. A secondary data copy stored in a volume for DR purposes can also be moved to the cloud, providing an economical DR environment with substantially reduced capacity requirements.

Summary

In any application environment, including containerized applications orchestrated by Kubernetes, Cloud Tiering can increase the available capacity and reduce costs in both production environments and backup/DR locations. Cloud Tiering turns future CAPEX into OPEX and reduces the overall cost, while freeing the high-performance tier for performance-demanding applications rather than cold storage.

Try Cloud Tiering as part of your Kubernetes deployment today.