Kubernetes Persistent Storage offers Kubernetes applications a convenient way to request and consume storage resources. A Volume is a basic building block of the Kubernetes storage architecture. Kubernetes Persistent Volumes are a type of Volume that lives within the Kubernetes cluster and can outlive the pods that use it, retaining data for long periods of time.
Other central Kubernetes storage concepts include Persistent Volume Claims, which are requests for storage resources made by users on behalf of their pods, and Storage Classes, which define types of storage, allowing workloads to consume Kubernetes storage solutions without knowing their underlying implementation.
In this post, we’ll review these concepts, explain Kubernetes storage integrations and features, and show how NetApp Cloud Volumes ONTAP can help provision highly available, high performance storage for Kubernetes applications.
This is part of an extensive series of guides about microservices.
Container file systems are ephemeral, meaning that when a container shuts down, all data it wrote to its own file system during its lifetime is lost. This is suitable for some applications, but in many cases, applications need to preserve state or share information with other applications. A common example is applications that rely on databases (see our post on MySQL Kubernetes). For these and other use cases, containers need a place to store information persistently, so that it can survive the shutdown of one or more containers.
Kubernetes provides a convenient persistent storage mechanism for containers. It is based on the concept of a Persistent Volume (PV). Kubernetes Volumes are constructs that allow you to mount a storage unit, such as a file system folder or a cloud storage disk, into the containers of a Pod and also share information between those containers. Regular Volumes share the lifetime of the Pod that declares them and are removed when that Pod is deleted. A Persistent Volume, by contrast, is a cluster-level resource that does not belong to any Pod and can remain available for as long as necessary for ongoing operations.
A Kubernetes persistent volume (PV) is a unit of storage provided by an administrator as part of a Kubernetes cluster. Just as a node is a compute resource used by the cluster, a PV is a storage resource.
Persistent volumes are independent of the lifecycle of the pods that use them, meaning that even if a pod shuts down, the data in the volume is not erased. Each PV is defined by an API object, which captures the implementation details of the storage, such as an NFS file share or a specific cloud storage system.
Kubernetes persistent volumes are administrator-provided volumes. They have predefined properties including file system, size, and identifiers like volume ID and name.
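To make these properties concrete, here is a minimal sketch of what an administrator-provided PV object might look like, built as a Python dictionary and serialized to JSON (kubectl accepts JSON as well as YAML). The volume name, the `manual` class name, and the NFS server address and export path are hypothetical placeholders, not values from this article.

```python
import json

# Illustrative PersistentVolume backed by an NFS share.
# The name, class, server address, and export path are hypothetical placeholders.
persistent_volume = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-nfs-example"},
    "spec": {
        "capacity": {"storage": "10Gi"},            # size of the volume
        "volumeMode": "Filesystem",                 # mounted as a file system
        "accessModes": ["ReadWriteMany"],           # how nodes may mount it
        "persistentVolumeReclaimPolicy": "Retain",  # keep data after release
        "storageClassName": "manual",
        "nfs": {"server": "nfs.example.internal", "path": "/exports/data"},
    },
}


def to_manifest(obj: dict) -> str:
    """Serialize a Kubernetes object as a JSON manifest."""
    return json.dumps(obj, indent=2)


if __name__ == "__main__":
    print(to_manifest(persistent_volume))
```

The printed output could be saved to a file and applied with `kubectl apply -f`, assuming the NFS export actually exists and is reachable from the cluster nodes.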
In order for a Pod to start using these volumes, it must request a volume by issuing a persistent volume claim (PVC). PVCs describe the storage capacity and characteristics a pod requires, and the cluster attempts to match the request and provision the desired persistent volume.
There are two related concepts you should understand as you start working with Kubernetes persistent volumes:
PersistentVolumeClaim (PVC)
This is a request for storage made by a user, typically on behalf of a Pod. The claim can include specific storage parameters required by the application, for example an amount of storage or a specific type of access (read/write, read-only, etc.).
Kubernetes looks for a PV that meets the criteria defined in the user’s PVC, and if there is one, it matches claim to PV. This is called binding. You can also configure the cluster to dynamically provision a PV for a claim.
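As an illustration of such a claim, the sketch below builds a PVC object that asks for 5Gi of storage with a given access mode. The claim name `app-data` and the class name `manual` are hypothetical and chosen only for the example.

```python
import json

# Illustrative PersistentVolumeClaim. The claim name and storage class
# name are hypothetical placeholders.
persistent_volume_claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteMany"],               # requested type of access
        "resources": {"requests": {"storage": "5Gi"}},  # requested amount of storage
        "storageClassName": "manual",                   # class the PV must belong to
    },
}

if __name__ == "__main__":
    print(json.dumps(persistent_volume_claim, indent=2))
```

A claim like this stays pending until a PV that satisfies the requested size, access mode, and class is available or is dynamically provisioned.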
StorageClass
The StorageClass object allows cluster administrators to define PVs with different properties, like performance, size or access parameters. It lets you expose persistent storage to users while abstracting the details of storage implementation. Kubernetes supports many storage plugins that a StorageClass can reference (see the following section), and you can create your own classes.
Administrators can define several StorageClasses that give users multiple options for performance. For example, one can be on a fast SSD drive but with limited capacity, and one on a slower storage service which provides high capacity.
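The two-tier setup described above could be sketched as follows. This example assumes a cluster running on AWS with the EBS CSI driver installed (provisioner `ebs.csi.aws.com`), where `gp3` is an SSD volume type and `sc1` is a lower-cost HDD type. The class names `fast-ssd` and `bulk-hdd` are hypothetical, and other platforms would use a different provisioner and parameters.

```python
import json


def storage_class(name: str, volume_type: str) -> dict:
    """Build a StorageClass object for the AWS EBS CSI driver (an assumption of this sketch)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        "parameters": {"type": volume_type},
        "reclaimPolicy": "Delete",
        # Delay provisioning until a Pod that uses the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
    }


# Hypothetical class names: a fast SSD tier and a slower, cheaper HDD tier.
fast = storage_class("fast-ssd", "gp3")
bulk = storage_class("bulk-hdd", "sc1")

if __name__ == "__main__":
    for sc in (fast, bulk):
        print(json.dumps(sc, indent=2))
```

Users then pick a tier simply by setting `storageClassName` in their PVC, without needing to know which storage backend sits behind each class.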
A PV is a cluster resource and a PVC is a request for a PV resource. The interaction between PVs and PVCs follows a distinct lifecycle, starting with provisioning and including binding, using, and reclaiming.
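Here are the two main types of provisioning: static provisioning, in which a cluster administrator creates PVs ahead of time that are then available for claims to consume, and dynamic provisioning, in which the cluster provisions a volume on demand for a PVC, based on its StorageClass, when no existing PV matches the claim.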
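Here is a quick rundown of how the binding process works: a control loop in the control plane watches for new PVCs, finds a matching PV if one exists, and binds the two together. If a PV was dynamically provisioned for a claim, it is always bound to that claim. A claim remains unbound for as long as no matching volume exists.
A PVC to PV binding is a one-to-one mapping. The process uses a ClaimRef, which creates a bi-directional binding between the PV and the PVC.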
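The one-to-one nature of binding can be illustrated with a small in-memory model. This is a simplified sketch, not the actual controller logic: the `PV` and `PVC` classes, the `bind` function, and the choice to prefer the smallest sufficient volume are assumptions made for the example, and `parse_quantity` handles only a subset of the Kubernetes quantity syntax.

```python
from dataclasses import dataclass
from typing import List, Optional

_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> int:
    """Convert a binary-suffixed quantity such as '10Gi' to bytes (subset only)."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


@dataclass
class PV:
    name: str
    capacity: str
    access_modes: frozenset
    storage_class: str
    claim_ref: Optional[str] = None    # name of the bound PVC, if any


@dataclass
class PVC:
    name: str
    request: str
    access_modes: frozenset
    storage_class: str
    volume_name: Optional[str] = None  # name of the bound PV, if any


def bind(claim: PVC, volumes: List[PV]) -> Optional[PV]:
    """Bind the claim to the smallest unbound PV that satisfies it, or return None."""
    candidates = [
        pv for pv in volumes
        if pv.claim_ref is None
        and pv.storage_class == claim.storage_class
        and claim.access_modes <= pv.access_modes
        and parse_quantity(pv.capacity) >= parse_quantity(claim.request)
    ]
    if not candidates:
        return None  # the claim stays unbound until a matching PV appears
    best = min(candidates, key=lambda pv: parse_quantity(pv.capacity))
    best.claim_ref = claim.name    # PV side of the binding
    claim.volume_name = best.name  # PVC side of the binding
    return best


if __name__ == "__main__":
    rwo = frozenset({"ReadWriteOnce"})
    pool = [PV("pv-small", "5Gi", rwo, "manual"), PV("pv-large", "100Gi", rwo, "manual")]
    print(bind(PVC("claim-a", "4Gi", rwo, "manual"), pool))
```

Because a bound PV is excluded from later matches, a second claim can never share a volume that is already bound, even if the volume has spare capacity.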
Pods use claims as volumes. Here is how the using process works: the cluster inspects the claim to find the bound volume and mounts that volume for the Pod.
Users schedule Pods and access their claimed PVs by including a persistentVolumeClaim section in the volumes block of the Pod.
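A Pod that consumes a claim might look like the sketch below. The Pod name, container image, mount path, and the claim name `app-data` are hypothetical; the claim is assumed to already exist in the same namespace as the Pod.

```python
import json

# Illustrative Pod that mounts a claimed volume. The names, the image, and
# the mount path are hypothetical placeholders.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                # Mount the volume named "data" into the container's file system.
                "volumeMounts": [{"name": "data", "mountPath": "/usr/share/nginx/html"}],
            }
        ],
        # The volumes block references the PVC by name; the cluster resolves
        # the claim to its bound PV.
        "volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": "app-data"}}],
    },
}

if __name__ == "__main__":
    print(json.dumps(pod, indent=2))
```

Note that the Pod refers only to the claim, never to the PV itself, which is what keeps the workload independent of the underlying storage.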
A reclaim policy for PVs tells the cluster what to do with the volume after its claim is released: the volume can be Retained, Recycled, or Deleted. Once users no longer need a volume, they can delete the PVC object from the API, which allows the resource to be reclaimed.
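The reclaim policy of an existing PV can be changed with a patch to its `persistentVolumeReclaimPolicy` field. The sketch below only builds the patch and the corresponding `kubectl patch pv` command line without running it; the PV name `pv-nfs-example` is a hypothetical placeholder.

```python
import json
import shlex

VALID_POLICIES = {"Retain", "Recycle", "Delete"}


def reclaim_policy_patch(policy: str) -> str:
    """Return a JSON patch body that sets the PV reclaim policy."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown reclaim policy: {policy!r}")
    return json.dumps({"spec": {"persistentVolumeReclaimPolicy": policy}})


def kubectl_patch_command(pv_name: str, policy: str) -> list:
    """Build (but do not run) the kubectl command that applies the patch."""
    return ["kubectl", "patch", "pv", pv_name, "-p", reclaim_policy_patch(policy)]


if __name__ == "__main__":
    # Hypothetical PV name; prints a shell-quoted command for inspection.
    print(shlex.join(kubectl_patch_command("pv-nfs-example", "Retain")))
```

Switching a dynamically provisioned volume to Retain before deleting its claim is a common way to avoid losing data by accident.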
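Kubernetes comes with numerous plugins that let you make different types of storage resources available to Pods running in the Kubernetes cluster.
These are in-tree plugins shipped together with the Kubernetes distribution, which can be consumed through the StorageClass object. Which of them are available depends on your Kubernetes version, since many in-tree plugins have been deprecated in favor of CSI drivers. Here are some of the main plugins that have been supported: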
| Cloud Storage and Virtualization | Proprietary Storage Platforms | Physical Drives / Storage Protocols |
|---|---|---|
| GCEPersistentDisk | Flocker | NFS |
| AWSElasticBlockStore | RBD (Ceph Block Device) | iSCSI |
| AzureFile | Cinder (OpenStack block storage) | FC (Fibre Channel) |
| AzureDisk | Glusterfs | |
| VsphereVolume | Flexvolume | |
| | Quobyte Volumes | |
| | Portworx Volumes | |
| | ScaleIO Volumes | |
| | StorageOS | |
For more details on these plugins, see the StorageClass documentation.
Read our blog post on the Kubernetes NFS integration.
Until recently, it was challenging to develop new storage volume plugins for Kubernetes. All volume plugins were “in tree”, meaning they were shipped together with the Kubernetes distribution, and vendors creating plugins had to align with the Kubernetes release process.
In early 2019, the Kubernetes project announced general availability of the Container Storage Interface (CSI) as of Kubernetes 1.13, which made Kubernetes volumes extensible. Any storage vendor can write a CSI plugin exposing their storage system, without having to touch the core Kubernetes code. A full list of CSI drivers available for use with Kubernetes is maintained in the Kubernetes CSI documentation.
Read our blog post on Container Storage Interface (CSI).
Kubernetes Persistent Volumes offer powerful capabilities. The most important are detailed below.
Capacity
The capacity attribute lets you set the storage capacity of the PV. Capacity is expressed as a Kubernetes resource quantity, such as 5Gi or 500Mi, to ensure quantities are standard across all storage services and devices.
Volume Mode
By default, Kubernetes creates a file system on the PV (Filesystem mode), but if desired, you can use a raw block device directly without an additional layer (Block mode).
Access Modes
A PV can have the following access modes:
● ReadWriteOnce: the volume can be mounted as read-write by only one node
● ReadOnlyMany: the volume can be mounted as read-only by multiple nodes simultaneously
● ReadWriteMany: the volume can be mounted as read-write by several nodes simultaneously
● ReadWriteOncePod: the volume can be mounted as read-write by a single pod (introduced as alpha in Kubernetes 1.22, stable since 1.29)
Note: Different storage plugins may only support some of these access modes.
Reclaim Policy
The reclaim policy specifies what happens to the PV once its claim has been released. It can be set to Retain, meaning the PV and its data are kept until an administrator explicitly reclaims or deletes them; Recycle, meaning the data is scrubbed and the volume is made available for a new claim (this policy is deprecated in favor of dynamic provisioning); or Delete, meaning the PV and its underlying storage asset are deleted.
Note: Different storage plugins may only support some of these reclamation policies.
Phase
A PV goes through the following lifecycle phases, which are visible to other entities in the cluster:
● Available: free for use, binding has not occurred yet
● Bound: the PV was matched to a PersistentVolumeClaim and binding has occurred
● Released: the user deleted their PVC, but the PV is not yet reclaimed by the cluster
● Failed: the PV could not be reclaimed by the cluster automatically
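The interaction between reclaim policy and phase can be summarized in a small model. This is a simplified sketch of the behavior described above, not Kubernetes source code; the function name and the `reclaim_succeeded` flag are hypothetical and exist only for illustration.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    AVAILABLE = "Available"
    BOUND = "Bound"
    RELEASED = "Released"
    FAILED = "Failed"


def phase_after_claim_deleted(reclaim_policy: str,
                              reclaim_succeeded: bool = True) -> Optional[Phase]:
    """Return the phase a Bound PV ends up in once its PVC is deleted.

    Returns None when the PV object itself no longer exists.
    """
    if reclaim_policy == "Retain":
        # Data is kept; the PV waits in Released for manual reclamation.
        return Phase.RELEASED
    if reclaim_policy == "Delete":
        # The PV and its backing storage are removed on success.
        return None if reclaim_succeeded else Phase.FAILED
    if reclaim_policy == "Recycle":
        # Deprecated: the volume is scrubbed and offered to new claims.
        return Phase.AVAILABLE if reclaim_succeeded else Phase.FAILED
    raise ValueError(f"unknown reclaim policy: {reclaim_policy!r}")


if __name__ == "__main__":
    for policy in ("Retain", "Delete", "Recycle"):
        print(policy, "->", phase_after_claim_deleted(policy))
```

In a real cluster, the current phase of each volume appears in the STATUS column of `kubectl get pv`.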
Kubernetes has revolutionized application development, deployment, and scaling. However, it does not itself provide the underlying storage for container data, so you need an external storage system to keep data available across Pod and cluster restarts.
The most popular option is a cloud storage solution that supports containerized applications. Cloud native storage offerings can reproduce the conditions of a cloud environment, enabling scalability, high availability, and container-based architecture. They integrate with Kubernetes to offer persistent storage.
A major benefit of Kubernetes is that it is supported by the major cloud providers and does not impose vendor lock-in. You can manage clusters in multi-cloud deployments using services from different vendors. However, the external storage solution must also support portability and integration with your existing monitoring tools. It must also offer high availability and performance, with the ability to scale according to dynamic system demands. An effective solution should also enable fast recovery in the event of data loss.
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps or any other enterprise workload.
In particular, Cloud Volumes ONTAP provides Kubernetes integration for persistent storage requirements of containerized workloads.
A Kubernetes volume is a directory containing data, which can be accessed by containers in a Kubernetes pod. Understand the main types of Kubernetes volumes, including persistent volumes and ephemeral volumes, learn about volumeMounts, deploying volumes, and more.
Read more: 5 Types of Kubernetes Volumes and How to Work with Them
This article is a side-by-side comparison of two technologies used to deploy Kubernetes persistent volumes for stateful applications: the open-source OpenEBS and NetApp’s Cloud Volumes ONTAP.
What are the tradeoffs between the two technologies, and which is right for enterprise-level Kubernetes deployments?
Read more: Storage Abstraction on Kubernetes: OpenEBS Vs. Cloud Volumes ONTAP.
As the backend storage management system for Kubernetes clusters, Cloud Volumes ONTAP provides a number of advanced features. This blog post gives you an overview of specific Cloud Volumes ONTAP features for Kubernetes-based stateful applications and takes a look at how enterprises can benefit from them.
Read more in Advanced Features on K8s with Cloud Volumes ONTAP: Scaling, Monitoring, and More.