
Kubernetes and Persistent Apps: An Interview with Michael Shaul

The Kubernetes ecosystem is a very popular choice to host and orchestrate a huge variety of container-based workloads. Yet, applications that require persistent storage have always been a challenge with Kubernetes.

Michael Shaul, NetApp’s Principal Technologist, sat down with Bruno Amaro Almeida, independent advisor and architect and frequent NetApp BlueXP blog contributor, to discuss the challenges, limitations, and solutions of running persistent applications in Kubernetes.

In this article, they share their insights and points of view on Kubernetes and persistent applications.

Michael Shaul is NetApp’s Principal Technologist. Based in Israel and with a long career working with data management and infrastructure, Michael is part of the Cloud Data Services CTO office and has a unique in-depth perspective of NetApp cloud technologies.

Bruno Amaro Almeida is an independent advisor and architect. Based in Finland and working in the areas of governance, cloud, security, and data engineering, Bruno has a balanced perspective that comes from helping organizations with both CxO-level cloud and data strategy and hands-on engineering execution.

What are the key differences and challenges regarding running stateless and stateful workloads in Kubernetes?

Michael Shaul (MS): Let’s start by stating the obvious: there are huge differences between running and operating a stateful workload and a stateless one.

In Kubernetes' early days, the platform was clearly made to run stateless applications. A lot of the tools that you needed inside Kubernetes to be able to operate a stateful application simply didn’t exist.

In the last few years that has changed significantly, with the Kubernetes community starting to build those capabilities and raising the importance of storage in the Kubernetes ecosystem. One key enabler of that was the Kubernetes Container Storage Interface (CSI). Similar to what was done for networking in Kubernetes, with CSI the community decided to externalize storage from the Kubernetes platform by building APIs and integration points, allowing third-party vendors to bring their storage solutions to the Kubernetes ecosystem in a native and transparent manner.

The key challenge in running persistent applications in Kubernetes is storage robustness. Data and storage aren’t as transparent to Kubernetes developers as other parts of the ecosystem, such as the container pods, load balancing, and autoscaling.

With stateful workloads, something needs to remain when your Kubernetes pod fails. If your storage is not robust enough, it can be hard to understand if and when something breaks, so you definitely should integrate your Kubernetes CSI with a storage infrastructure you can rely on.

Developers shouldn’t have to worry if their storage can scale and if their data is protected. Likewise, with an abstraction such as CSI, it’s not up to Kubernetes to make sure the storage infrastructure is reliable and robust enough to handle your workload.

At NetApp, this is our specialty. We provide Kubernetes with the same reliable and robust storage infrastructure environments that our enterprise customers use with systems such as Oracle DBs and SAP. Our storage robustness, transparency, and scalability are available to Kubernetes from the get-go.

Learn more about Managing Stateful Applications in Kubernetes and Dynamic Kubernetes Persistent Volume Provisioning with NetApp Trident and Cloud Volumes ONTAP.
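
To make the CSI abstraction discussed above a little more concrete, here is a minimal sketch of what a developer typically hands to Kubernetes: a PersistentVolumeClaim that names a StorageClass, leaving the actual provisioning to whichever CSI driver backs that class. The claim name `db-data`, the StorageClass name `example-csi-class`, and the helper `build_pvc` are hypothetical and chosen only for illustration; the real class name depends on how your cluster and storage driver are configured.

```python
import json


def build_pvc(name, storage_class, size, access_mode="ReadWriteOnce"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    The developer only states what is needed (size, access mode, class);
    the CSI driver behind the StorageClass decides how to provision it.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            # Hypothetical class name; it must match a StorageClass in the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    pvc = build_pvc("db-data", "example-csi-class", "20Gi")
    # JSON is accepted wherever a YAML manifest is, so this can be saved and applied.
    print(json.dumps(pvc, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` should result in a dynamically provisioned volume, assuming a StorageClass with that name exists. Nothing in the claim refers to a specific vendor or backend, which is the transparency the interview describes.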

Will serverless and stateless initiatives such as Knative change the landscape?

BA: While the Kubernetes community is working towards building more stateful and data-resilient capabilities into the ecosystem, other streams of work, such as Knative, have been making it easier to manage stateless and serverless workloads. Will these worlds clash or merge over time?

MS: The Kubernetes stateless and stateful worlds will definitely get closer and closer over time. Looking at ourselves at NetApp, we built this nice robust engineering platform with plenty of knobs that you can turn based on your needs.

However, our ambition is that you actually won’t need anyone to turn those knobs. We are building towards a storageless direction—pun on serverless intended!—where our automation and transparency enable development teams to use storage without having to worry about the underlying infrastructure. The platform is smart enough to operate itself in the best possible way, and automatically turns those knobs for data tiering, performance, and data protection to adapt to the workload needs.

What are the pros and cons of running your database in Kubernetes versus using a cloud managed database service?

BA: Databases are a somewhat controversial topic when it comes to stateful applications in Kubernetes. With every cloud provider offering multiple relational and non-relational managed database services, developers using Kubernetes often wonder if and when they should use them or deploy the database inside their Kubernetes clusters.

MS: The managed database services are great at reducing your operational overhead; however, you are limited to what the cloud provider can give you. Without access to the underlying database infrastructure, you usually can’t fully modify the configuration and adapt it to your business needs.

If you decide to self-manage the database, Kubernetes makes it a lot easier compared to running it on top of virtual machines. Out of the box, you get deployment templates, scalability, and replication between database nodes, among other benefits. It definitely comes with its own set of challenges, and because of that we also see more database providers building native Kubernetes operators and integrations.

From a storage perspective, we also see the Kubernetes CSI developing further and enabling features that, combined with the NetApp solution, make it easier to self-manage databases in Kubernetes, such as storage expansion, snapshot copies, tiering, and dedicated data resources for databases. Take snapshots, for example: in the NetApp Cloud Volumes ONTAP world, they don’t cost anything (only the data blocks that change are billed), so it’s quite inexpensive to clone a database to test before a major upgrade or modification, without having to know anything about NetApp Cloud Volumes ONTAP.
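
As a rough illustration of the snapshot-and-clone workflow mentioned above, the sketch below builds two standard Kubernetes manifests: a VolumeSnapshot of an existing claim, and a new PersistentVolumeClaim restored from that snapshot for testing. All object names (`db-data`, `db-pre-upgrade`, `db-data-test`, `example-snapclass`, `example-csi-class`) and both helper functions are hypothetical. It also assumes the cluster has the snapshot CRDs and controller installed and a CSI driver that supports snapshots.

```python
import json

SNAPSHOT_API_GROUP = "snapshot.storage.k8s.io"


def build_snapshot(name, source_pvc, snapshot_class):
    """Return a VolumeSnapshot manifest that captures an existing PVC."""
    return {
        "apiVersion": f"{SNAPSHOT_API_GROUP}/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            # Hypothetical class name; it must match a VolumeSnapshotClass.
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


def build_clone_pvc(name, snapshot_name, storage_class, size):
    """Return a PVC manifest whose contents are restored from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
            # dataSource tells the provisioner to pre-populate the new volume.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": SNAPSHOT_API_GROUP,
            },
        },
    }


if __name__ == "__main__":
    snapshot = build_snapshot("db-pre-upgrade", "db-data", "example-snapclass")
    clone = build_clone_pvc("db-data-test", "db-pre-upgrade", "example-csi-class", "20Gi")
    for manifest in (snapshot, clone):
        print(json.dumps(manifest, indent=2))
```

A test copy of the database can then mount `db-data-test` while the production claim stays untouched. Storage expansion follows a similar declarative pattern: raising the requested size on an existing claim, provided its StorageClass allows volume expansion.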

What are the misconceptions regarding Kubernetes and persistent apps in hybrid and multicloud scenarios?

BA: Hybrid and multicloud scenarios are often associated with the usage of Kubernetes. Yet, there are several mistakes and misconceptions that business leaders have regarding those cloud strategies.

MS: When it comes to using Kubernetes in those types of scenarios, some people used to imagine that workloads could magically shift from one cloud to another with the push of a button. There were even organizations that attempted to do that by federating clusters across multiple cloud providers, but they quickly realized that it is a massive engineering challenge to tackle. Don’t get me wrong, several organizations use Kubernetes with hybrid and multicloud strategies (because of cost, compliance, access to a specific geographical region, or the need to leverage a certain cloud service, among other motivations), but it all boils down to a multi-factor decision, not a technical fantasy.

Persistent applications also make it a bit harder to accomplish that idealistic vision. The Kubernetes storage APIs are essentially the same, but each cloud provider's built-in storage brings its own nuances. What NetApp does is bring that transparency from a data perspective, allowing you to shift data across different cloud environments. Moreover, when talking with our customers, we see the need to keep strict, standard policies wherever you are.

It’s not just about having data anywhere you want; it’s having it how you need it to be, with compliance that works across different environments wherever data resides.

Going forward, we see more and more stateful applications and workloads going to Kubernetes, and with both the community and vendors bringing new features, it will become even easier to operate and manage them.

With serverless capabilities also appearing in the stateful world, the underlying data infrastructure will also become more transparent and automated for the engineering teams, making it more adaptable to the needs of the business.
