Kubernetes Persistent Volumes and the PV Lifecycle

Written by Yifat Perry, Technical Content Manager | May 30, 2022 6:04:54 AM

Kubernetes Persistent Storage offers Kubernetes applications a convenient way to request and consume storage resources. A Volume is a basic building block of the Kubernetes storage architecture. Kubernetes Persistent Volumes are a type of Volume that lives within the Kubernetes cluster and can outlive the pods that use it, retaining data for long periods of time.

Other central Kubernetes storage concepts include Persistent Volume Claims, which are requests made by users, on behalf of their pods, for storage resources, and Storage Classes, which define types of storage, allowing Kubernetes resources to access Kubernetes storage solutions without knowing their underlying implementation.

In this post, we’ll review these concepts, explain Kubernetes storage integrations and features, and show how NetApp Cloud Volumes ONTAP can help provision highly available, high performance storage for Kubernetes applications.

This is part of an extensive series of guides about microservices.

In this article you will learn:

What is Kubernetes Persistent Storage?
What are Kubernetes Persistent Volumes (PV)?
Persistent Volume Life Cycle
- Provisioning
  - Static
  - Dynamic
- Binding
- Using
- Reclaiming
Types of Persistent Volumes (PV)
Persistent Volumes Features
Choosing Kubernetes Storage Solutions
Kubernetes Persistent Storage with Cloud Volumes ONTAP

What is Kubernetes Persistent Storage?

Containers are ephemeral, meaning that when a container shuts down, all data written to its own file system during its lifetime is lost. This is suitable for some applications, but in many cases, applications need to preserve state or share information with other applications. A common example is applications that rely on databases (see our post on MySQL Kubernetes). For these and other use cases, containers need a place to store information persistently, so that it can survive the shutdown of one or more containers.

Kubernetes provides a convenient persistent storage mechanism for containers. It is based on the concept of a Persistent Volume (PV). Kubernetes Volumes are constructs that allow you to mount a storage unit, such as a file system folder or a cloud storage bucket, into the containers of a Pod and also share information between those containers. Regular Volumes are deleted when the Pod that uses them is removed. A Persistent Volume, by contrast, is a cluster-level resource that exists independently of any Pod and can remain alive for as long as necessary for ongoing operations.

What are Kubernetes Persistent Volumes (PV)?

A Kubernetes persistent volume (PV) is a unit of storage provided by an administrator as part of a Kubernetes cluster. Just as a node is a compute resource used by the cluster, a PV is a storage resource.

Persistent volumes are independent of the lifecycle of the pods that use them, meaning that even if a pod shuts down, the data in the volume is not erased. They are defined by an API object, which captures the implementation details of storage such as NFS file shares, or specific cloud storage systems.

Kubernetes persistent volumes are administrator-provided volumes. They have predefined properties including file system, size, and identifiers like volume ID and name.

For a Pod to start using one of these volumes, a persistent volume claim (PVC) must be created and referenced in the Pod specification. PVCs describe the storage capacity and characteristics a pod requires, and the cluster attempts to match the request to an existing persistent volume or provision a new one.

There are two related concepts you should understand as you start working with Kubernetes persistent volumes:

PersistentVolumeClaim (PVC)

This is a request for storage made by a user, typically on behalf of an application running in a pod. The claim can include specific storage parameters required by the application, for example an amount of storage, or a specific type of access (read/write, read-only, etc.).

Kubernetes looks for a PV that meets the criteria defined in the user’s PVC, and if there is one, it matches claim to PV. This is called binding. You can also configure the cluster to dynamically provision a PV for a claim.
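As a rough illustration of what a claim contains, the sketch below builds a PVC manifest as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The helper make_pvc, the claim name app-data, and the class name fast are hypothetical names chosen for this example.

```python
import json

def make_pvc(name, size, access_modes, storage_class=None):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    spec = {
        "accessModes": list(access_modes),
        # The requested capacity, written as a Kubernetes quantity string.
        "resources": {"requests": {"storage": size}},
    }
    # Only set a class when one is requested; omitting the field lets the
    # cluster fall back to its default StorageClass, if it has one.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }

# Hypothetical claim: 5 GiB, read-write from a single node, class "fast".
pvc = make_pvc("app-data", "5Gi", ["ReadWriteOnce"], storage_class="fast")
print(json.dumps(pvc, indent=2))
```

The printed document could be saved to a file and applied with kubectl apply -f, assuming a StorageClass with the requested name exists in the cluster.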

StorageClass

The StorageClass object allows cluster administrators to define PVs with different properties, like performance, size or access parameters. It lets you expose persistent storage to users while abstracting the details of storage implementation. Kubernetes supports many storage provisioners that a StorageClass can reference (see the section on types of Persistent Volumes below), and you can create as many StorageClasses of your own as you need.

Administrators can define several StorageClasses that give users multiple options for performance. For example, one can be on a fast SSD drive but with limited capacity, and one on a slower storage service which provides high capacity.
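The two-tier setup described above might look something like the following sketch, which builds two StorageClass manifests as Python dictionaries. The provisioner name csi.example.com and the tier parameter are placeholders invented for this example; real provisioners define their own names and parameters.

```python
import json

def make_storage_class(name, provisioner, parameters, reclaim_policy="Delete"):
    """Return a StorageClass manifest as a plain dict."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Which volume plugin or driver creates the volumes for this class.
        "provisioner": provisioner,
        # Opaque key/value settings interpreted by the provisioner.
        "parameters": dict(parameters),
        # Applied to PVs that are dynamically created from this class.
        "reclaimPolicy": reclaim_policy,
    }

# Hypothetical classes: a fast SSD tier and a slower high-capacity tier.
fast = make_storage_class("fast", "csi.example.com", {"tier": "ssd"})
bulk = make_storage_class("bulk", "csi.example.com", {"tier": "hdd"},
                          reclaim_policy="Retain")

for storage_class in (fast, bulk):
    print(json.dumps(storage_class, indent=2))
```

A PVC then selects one of these tiers by naming the class in its storageClassName field.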

Persistent Volume Life Cycle

A PV is a cluster resource and a PVC is a request for a PV resource. The interaction between PVs and PVCs follows a distinct lifecycle, starting with provisioning and including binding, using, and reclaiming.

Provisioning

Here are the two main types of provisioning:

Static - PVs created by a cluster administrator. These PVs are located in the Kubernetes API and contain information about storage resources available to cluster users.
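Dynamic - to meet PVC demands, clusters can attempt to dynamically provision a volume. This typically occurs when none of the available static PVs match the PVC. Dynamic provisioning is based on StorageClasses: the PVC requests a class that the administrator has created, or the cluster can optionally be configured with a default StorageClass that is used when the PVC does not name one.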
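For the static case, an administrator creates the PV object ahead of time. The sketch below builds such a manifest for an NFS share as a Python dictionary. The helper make_nfs_pv, the server name nfs.example.internal, and the export path are hypothetical values used only for illustration.

```python
import json

def make_nfs_pv(name, size, server, path, storage_class=""):
    """Return a statically provisioned, NFS-backed PersistentVolume manifest."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            # Keep the data after the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            # An empty class name means the PV belongs to no StorageClass.
            "storageClassName": storage_class,
            "nfs": {"server": server, "path": path},
        },
    }

# Hypothetical NFS server and export path.
pv = make_nfs_pv("nfs-pv-01", "10Gi", "nfs.example.internal", "/exports/data")
print(json.dumps(pv, indent=2))
```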

Binding

Here is a quick rundown of how the binding process works:

A user creates a PVC request, specifying a certain amount of storage and the required access modes.
A control loop, running in the control plane, looks for new PVCs, and then:
- Static - the control loop attempts to find a matching PV and then binds it to the PVC.
- Dynamic - if a PV was already dynamically provisioned for the PVC, the control loop binds them together.
After a PVC and a PV are bound, they remain exclusive.

A PVC to PV binding is a one-to-one mapping. The process uses a ClaimRef, which creates a bi-directional binding between the PV and the PVC.
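The matching step can be modeled in a few lines. The sketch below is a simplified model of static binding, not the actual Kubernetes controller: capacities are plain integers in GiB, and choosing the smallest volume that fits is an assumption made for this example. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PV:
    name: str
    capacity_gib: int
    access_modes: List[str]
    storage_class: str = ""
    claim_ref: Optional[str] = None    # name of the bound PVC, if any

@dataclass
class PVC:
    name: str
    request_gib: int
    access_modes: List[str]
    storage_class: str = ""
    volume_name: Optional[str] = None  # name of the bound PV, if any

def find_match(pvc, pvs):
    """Return the smallest unbound PV that satisfies the claim, or None."""
    candidates = [
        pv for pv in pvs
        if pv.claim_ref is None
        and pv.storage_class == pvc.storage_class
        and pv.capacity_gib >= pvc.request_gib
        and set(pvc.access_modes) <= set(pv.access_modes)
    ]
    return min(candidates, key=lambda pv: pv.capacity_gib, default=None)

def bind(pvc, pvs):
    """Bind the claim to a matching PV in both directions (one-to-one)."""
    pv = find_match(pvc, pvs)
    if pv is None:
        return False               # the claim stays unbound
    pv.claim_ref = pvc.name        # PV side of the binding
    pvc.volume_name = pv.name      # PVC side of the binding
    return True

pvs = [
    PV("pv-large", 100, ["ReadWriteOnce", "ReadOnlyMany"]),
    PV("pv-small", 10, ["ReadWriteOnce"]),
]
claim = PVC("app-data", 5, ["ReadWriteOnce"])
bound = bind(claim, pvs)
print(bound, claim.volume_name)
```

Because a bound PV is skipped by find_match, a second claim can never take a volume that is already in use, which mirrors the exclusive, one-to-one nature of the binding.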

Using

Pods use claims as volumes. Here is how the using process works:

The cluster inspects the claim to find the bound volume and then mounts that volume.
When using volumes that support multiple access modes, users can specify the desired mode.
Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it.

Users schedule Pods and access their claimed PVs by including a persistentVolumeClaim section in the volumes block of the Pod.
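A minimal sketch of such a Pod, again built as a Python dictionary, is shown below. The helper make_pod, the Pod name, the claim name app-data, and the mount path /data are hypothetical; the claim must already exist in the same namespace as the Pod.

```python
import json

def make_pod(name, image, claim_name, mount_path):
    """Return a Pod manifest that mounts an existing PVC into its container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                # Where the volume appears inside the container.
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            # The volume is backed by the claim, not by a PV named directly.
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }

# Hypothetical Pod that mounts the claim "app-data" at /data.
pod = make_pod("web", "nginx", "app-data", "/data")
print(json.dumps(pod, indent=2))
```

Note that the Pod refers only to the claim. Which PV sits behind it is resolved by the cluster through the binding.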

Reclaiming

A reclaim policy for PVs tells the cluster what to do with the volume after the claim is released. Volumes can be Retained, Recycled, or Deleted. Once users do not need their volume anymore, they can delete the PVC objects from the API, which allows the resource to be reclaimed.
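The effect of each policy can be summarized as a small lookup. The function below is an illustrative model rather than cluster code: it returns the phase the PV ends up in once its claim is deleted, and whether the stored data is still there. The function name is hypothetical.

```python
def on_claim_deleted(reclaim_policy):
    """Model what happens to a bound PV when its PVC is deleted.

    Returns (pv_phase, data_kept); pv_phase is None when the PV object
    itself is removed from the cluster.
    """
    if reclaim_policy == "Retain":
        # The PV stays, still holding the data, until an admin cleans it up.
        return ("Released", True)
    if reclaim_policy == "Recycle":
        # The data is scrubbed and the PV is offered to new claims.
        return ("Available", False)
    if reclaim_policy == "Delete":
        # The PV object and the backing storage asset are both removed.
        return (None, False)
    raise ValueError(f"unknown reclaim policy: {reclaim_policy}")

for policy in ("Retain", "Recycle", "Delete"):
    print(policy, on_claim_deleted(policy))
```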

Types of Persistent Volumes (PV)

Kubernetes comes with numerous plugins that let you make different types of storage resources available to nodes in the Kubernetes cluster.

In-Tree Plugins

These are plugins shipped together with the Kubernetes distribution, which a StorageClass can reference as its provisioner. Here are some of the main plugins that have been supported; note that several of them have since been deprecated or removed in favor of CSI drivers:

Cloud Storage and Virtualization	Proprietary Storage Platforms	Physical Drives / Storage Protocols
GCEPersistentDisk	Flocker	NFS
AWSElasticBlockStore	RBD (Ceph Block Device)	iSCSI
AzureFile	Cinder (OpenStack block storage)	FC (Fibre Channel)
AzureDisk	Glusterfs
VsphereVolume	Flexvolume
	Quobyte Volumes
	Portworx Volumes
	ScaleIO Volumes
	StorageOS

For more details on these plugins, see the StorageClass documentation.

Read our blog post on the Kubernetes NFS integration.

CSI Plugins

Until recently, it was challenging to develop new storage volume plugins for Kubernetes. All volume plugins were “in tree”, meaning they were shipped together with the Kubernetes distribution, and vendors creating plugins had to align with the Kubernetes release process.

Kubernetes then adopted the Container Storage Interface (CSI), which reached general availability in Kubernetes 1.13 and made Kubernetes volumes extensible. Any storage vendor can write a CSI plugin exposing their storage system, without having to touch the core Kubernetes code. Here is the full list of CSI drivers available for use with Kubernetes.
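From the manifest side, a CSI-backed volume differs from an in-tree one mainly in its volume source. The sketch below builds a PV that points at a CSI driver. The driver name csi.example.com and the volume handle vol-0001 are placeholders; a real driver documents its own name and the format of its volume handles.

```python
import json

def make_csi_pv(name, size, driver, volume_handle, fs_type="ext4"):
    """Return a PersistentVolume manifest backed by a CSI driver."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Delete",
            "csi": {
                # Name under which the driver is registered in the cluster.
                "driver": driver,
                # Identifier of the volume in the external storage system.
                "volumeHandle": volume_handle,
                "fsType": fs_type,
            },
        },
    }

# Hypothetical driver name and volume identifier.
csi_pv = make_csi_pv("csi-pv-01", "20Gi", "csi.example.com", "vol-0001")
print(json.dumps(csi_pv, indent=2))
```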

Read our blog post on Container Storage Interface (CSI).

Persistent Volumes Features

Kubernetes Persistent Volumes offer powerful capabilities. The most important are detailed below.

Capacity	The capacity attribute lets you set the storage capacity of the PV. Capacity is written as a Kubernetes quantity, such as 5Gi or 500M, which resolves to a number of bytes so that quantities are standard across all storage services and devices.
Volume Mode	By default, Kubernetes creates a file system on the PV, but if desired, you can use a raw block device directly without an additional layer.
Access Modes	A PV can have the following access modes:
● ReadWriteOnce: enables read and write and can be mounted by only one node at a time (several pods on that node can share it)
● ReadOnlyMany: enables read only and can be mounted by multiple nodes simultaneously
● ReadWriteMany: enables both read and write and can be mounted by several nodes simultaneously
● ReadWriteOncePod: the volume can be mounted as read-write by a single pod (introduced as alpha in Kubernetes 1.22, stable since 1.29)
Note: Different storage plugins may only support some of these access modes.
Reclaim Policy	The reclaim policy specifies what happens to the PV once its claim has been released. It can be set to:
● Retain: the PV and its data are kept until an administrator explicitly reclaims or deletes them
● Recycle: the data is scrubbed with a basic delete and the PV becomes available for a new claim (this policy is deprecated in favor of dynamic provisioning)
● Delete: the PV and the associated storage asset are irreversibly deleted
Note: Different storage plugins may only support some of these reclamation policies.
Phase	A PV goes through the following lifecycle phases, which are visible to other entities in the cluster:
● Available: free for use, binding has not occurred yet
● Bound: the PV was matched to a PersistentVolumeClaim and binding has occurred
● Released: the user deleted their PVC, but the PV is not yet reclaimed by the cluster
● Failed: the PV could not be reclaimed by the cluster automatically
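Capacity values in manifests are usually written with a suffix, such as 5Gi, rather than as a raw byte count. The sketch below converts such strings to bytes. It is a deliberately reduced version of the Kubernetes quantity format: it assumes whole numbers and handles only the common binary (Ki, Mi, Gi, ...) and decimal (k, M, G, ...) suffixes, and the function name is hypothetical.

```python
# Binary suffixes are powers of 1024; decimal suffixes are powers of 1000.
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
          "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9,
           "T": 10**12, "P": 10**15, "E": 10**18}

def quantity_to_bytes(quantity):
    """Convert a whole-number quantity string such as '5Gi' to bytes."""
    for suffix, factor in {**BINARY, **DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    # No recognized suffix: the value is already a plain byte count.
    return int(quantity)

for example in ("5Gi", "5G", "1024"):
    print(example, quantity_to_bytes(example))
```

The difference between 5Gi and 5G is easy to overlook: the first is 5 x 2^30 bytes and the second is 5 x 10^9 bytes, so a claim for 5Gi does not fit on a volume with a capacity of 5G.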

Choosing Kubernetes Storage Solutions

Kubernetes has revolutionized application development, deployment, and scaling. However, Kubernetes does not itself supply the underlying storage for container data, so you need to deploy external storage systems to keep data available when pods, nodes, or the cluster restart.

The most popular option is a cloud storage solution that supports containerized applications. Cloud native storage offerings can reproduce the conditions of a cloud environment, enabling scalability, high availability, and container-based architecture. They integrate with Kubernetes to offer persistent storage.

A major benefit of Kubernetes is that it is supported by the major cloud providers and does not impose vendor lock-in. You can manage clusters in multi-cloud deployments using services from different vendors. However, the external storage solution must also support portability and integration with your existing monitoring tools. It must also offer high availability and performance, with the ability to scale according to dynamic system demands. An effective solution should also enable fast recovery in the event of data loss.

Kubernetes Persistent Storage with Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps or any other enterprise workload.

In particular, Cloud Volumes ONTAP provides Kubernetes integration for persistent storage requirements of containerized workloads.

Learn More about Kubernetes Persistent Volume

5 Types of Kubernetes Volumes and How to Work with Them

A Kubernetes volume is a directory containing data, which can be accessed by containers in a Kubernetes pod. Understand the main types of Kubernetes volumes, including persistent volumes and ephemeral volumes, learn about volumeMounts, deploying volumes, and more.

Storage Abstraction on Kubernetes: OpenEBS Vs. Cloud Volumes ONTAP

This article is a side-by-side comparison of two technologies used to deploy Kubernetes persistent volumes for stateful applications: the open-source OpenEBS and NetApp’s Cloud Volumes ONTAP.

What are the tradeoffs between the two technologies, and which is right for enterprise-level Kubernetes deployments?

Read more: Storage Abstraction on Kubernetes: OpenEBS Vs. Cloud Volumes ONTAP.

Advanced Features on K8s with Cloud Volumes ONTAP: Scaling, Monitoring, and More

As the backend storage management system for Kubernetes clusters, Cloud Volumes ONTAP provides a number of advanced features. This blog post gives you an overview of specific Cloud Volumes ONTAP features for Kubernetes-based stateful applications and takes a look at how enterprises can benefit from them.

See Additional Guides on Key Microservices Topics

Together with our content partners, we have authored in-depth guides on several other topics that can also be useful as you explore the world of microservices.