Google Kubernetes Engine: Ultimate Quick Start Guide

What is Google Kubernetes Engine (GKE)?

Google Kubernetes Engine (GKE) lets you deploy, manage, and scale containerized applications in a managed environment. A GKE cluster consists of multiple machines, which are Compute Engine instances, grouped together.

GKE uses the open-source Kubernetes orchestration system to manage clusters. It lets you deploy clusters with pre-configured workload settings, autoscale pods and clusters, and manage networking and security. GKE takes care of the heavy lifting of cluster configuration, so you can deploy and manage applications, set policies, and monitor and administer your workloads using regular Kubernetes commands.

This is part of our series of articles about Kubernetes storage.

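In this article, you will learn:

  • Google Kubernetes Engine Key Features
  • Google Kubernetes Engine Tutorial: Creating Your First GKE Cluster
  • Google Kubernetes Engine Best Practices
  • Optimizing GKE Storage with NetApp Cloud Volumes ONTAP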

Google Kubernetes Engine Key Features

Here are the main features of Google’s managed Kubernetes service.

Autopilot Mode

In Autopilot mode, GKE provides clusters with pre-configured workload settings and takes responsibility not only for the control plane (master nodes) but also for the worker nodes. Google calls this a “nodeless experience”. It improves operational efficiency and enhances security by restricting access to the Kubernetes API by default and preventing node changes. It can also reduce costs, because it cuts unused capacity and overhead.
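
As a rough sketch of what creating an Autopilot cluster from the command line might look like, the helper below wraps the `gcloud container clusters create-auto` command. The function name, cluster name, and region are placeholders chosen for illustration and do not come from this article.

```shell
# Hypothetical helper: create a GKE cluster in Autopilot mode.
# $1 = cluster name, $2 = region (Autopilot clusters are regional).
create_autopilot_cluster() {
  gcloud container clusters create-auto "$1" --region="$2"
}

# Example call with placeholder values:
# create_autopilot_cluster demo-autopilot us-central1
```

Because Autopilot manages the nodes, there is no node count or machine type to pass here.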

Pod and Cluster Autoscaling

GKE autoscales pods horizontally based on CPU utilization or custom metrics. It also continuously analyzes the CPU and memory usage of pods and dynamically adjusts their CPU and memory requests. At the infrastructure level, it manages autoscaling for each node pool, so you can automatically scale clusters across multiple node pools according to changing workload requirements.

Run Custom and Marketplace Kubernetes Applications

GKE lets you run your own or any publicly available container images, and also use prebuilt Kubernetes applications available on the Google Cloud Marketplace. These are enterprise-grade containerized solutions with deployment templates, simplified licensing, and on-demand pricing, which you can plug into your Kubernetes clusters.

Workload and Network Security

GKE security capabilities include:

  • GKE Sandbox: prevents untrusted code from interacting with the host kernel
  • Kubernetes network policies: natively supported, with the ability to limit traffic using pod-level firewall rules
  • Private clusters: can be restricted to a private endpoint, or to a public endpoint that is only reachable from a specific range of addresses
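
To make two of these capabilities more concrete, here is a minimal sketch of how network policy enforcement and a GKE Sandbox node pool might be enabled with `gcloud`. The function names and all arguments are hypothetical placeholders, and additional prerequisites (such as supported machine types or an existing default node pool) may apply in a real project.

```shell
# Hypothetical helpers; every name passed in is a placeholder.

# Create a cluster with Kubernetes network policy enforcement enabled.
# $1 = cluster name, $2 = zone.
create_cluster_with_network_policy() {
  gcloud container clusters create "$1" --zone="$2" --enable-network-policy
}

# Add a node pool whose pods run inside GKE Sandbox (gVisor).
# GKE Sandbox needs a containerd-based node image.
# $1 = node pool name, $2 = cluster name, $3 = zone.
create_sandbox_pool() {
  gcloud container node-pools create "$1" --cluster="$2" --zone="$3" \
    --image-type=cos_containerd --sandbox type=gvisor
}

# Example calls with placeholder values:
# create_cluster_with_network_policy demo-cluster us-central1-a
# create_sandbox_pool sandbox-pool demo-cluster us-central1-a
```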

Google Kubernetes Engine Tutorial: Creating Your First GKE Cluster

Here is how to create a simple, single-node cluster on Google Kubernetes Engine.

Start by enabling the Kubernetes Engine API for your Google Cloud project. In the Google Cloud Console, navigate to the Kubernetes Engine page, select a project, and enable the API. While waiting for the API to be enabled, make sure that billing is active for the project.
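
If you prefer the command line to the console, the same step can be sketched roughly as follows. The helper name and the project ID are placeholders; `container.googleapis.com` is the service name of the Kubernetes Engine API.

```shell
# Hypothetical helper: select a project and enable the Kubernetes Engine API.
# $1 = Google Cloud project ID (placeholder).
enable_gke_api() {
  gcloud config set project "$1" &&
    gcloud services enable container.googleapis.com
}

# Example call with a placeholder project ID:
# enable_gke_api my-project-id
```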

Choosing a Shell

When setting up a cluster, you can use a local shell or Google's Cloud Shell, which comes pre-installed with two useful CLI tools: kubectl (used to manage Kubernetes clusters and the workloads running on them) and gcloud (used to manage Google Cloud resources).

Creating a GKE Cluster

To create a simple cluster with one worker node, use the following command. GKE provisions and manages the Kubernetes control plane (master) for you.

gcloud container clusters create {Cluster name} --num-nodes=1

Of course, a single-node cluster is only suitable for testing purposes.

Get Authentication Credentials for the Cluster

After creating a cluster, you need to set up your credentials before you can interact with it. To do this, use the following command, which configures kubectl with your credentials:

gcloud container clusters get-credentials {Cluster name}

That’s it! You can experiment with running workloads on your simple cluster, and when ready, try running clusters with additional nodes.
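
As one possible first experiment, the sketch below checks that the node is registered and then starts a small test deployment. The helper name and deployment name are hypothetical, and the public `nginx` image is used purely as an example workload.

```shell
# Hypothetical helper: confirm the node is visible, then run a test workload.
# $1 = deployment name (placeholder).
deploy_test_workload() {
  kubectl get nodes &&
    kubectl create deployment "$1" --image=nginx &&
    kubectl get pods
}

# Example call with a placeholder name:
# deploy_test_workload hello-test
```

When you are done, `kubectl delete deployment` followed by the deployment name removes the test workload again.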

Google Kubernetes Engine Best Practices

Here are a few best practices that will help you make the most of Google Kubernetes Engine.

GKE Autoscaling

Kubernetes provides several autoscaling mechanisms, but they can be difficult to configure and maintain. GKE manages pod and cluster autoscaling for you, provisioning sufficient resources for your workloads, while ensuring you do not run more resources than necessary.

Vertical Pod Autoscaler

GKE manages the Vertical Pod Autoscaler (VPA), which monitors pods over time to determine the optimal requirements for CPU and memory resources.

VPA allows you to: 

  • Set the right resources to achieve cost efficiency and maintain stability (request too little and your application could fail due to insufficient memory; request too much and you waste money on idle resources)
  • Handle stateless and stateful workloads that are not managed by the Horizontal Pod Autoscaler (see below), or workloads whose pod resource requests you cannot estimate accurately
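
A minimal sketch of putting VPA to work might look like the following: one helper enables vertical pod autoscaling on an existing cluster, and another prints a VerticalPodAutoscaler manifest for a Deployment. The function names, the `-vpa` naming suffix, and the `Auto` update mode are illustrative assumptions, not settings taken from this article.

```shell
# Hypothetical helper: enable vertical pod autoscaling on an existing cluster.
# $1 = cluster name, $2 = zone.
enable_vpa() {
  gcloud container clusters update "$1" --zone="$2" \
    --enable-vertical-pod-autoscaling
}

# Hypothetical helper: print a VerticalPodAutoscaler manifest.
# $1 = name of the Deployment to target.
vpa_manifest() {
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ${1}-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${1}
  updatePolicy:
    updateMode: "Auto"
EOF
}

# Example with placeholder names:
# enable_vpa demo-cluster us-central1-a
# vpa_manifest my-app | kubectl apply -f -
```

Setting `updateMode` to `"Off"` instead would produce recommendations only, which can be a gentler way to start.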

Horizontal Pod Autoscaler

GKE also manages the Horizontal Pod Autoscaler (HPA). While VPA works gradually to anticipate processing and memory needs, HPA adds and deletes pod replicas in response to usage changes. It is designed to scale applications based on their load metrics.

HPA allows you to:

  • Configure the load metrics to scale on, such as CPU utilization or custom metrics
  • Add and delete replicas of pods
  • Handle stateless workers that can react quickly to spikes and drops in usage, maintaining workload stability
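
To illustrate, the sketch below attaches an HPA to a Deployment with `kubectl autoscale` and also reproduces the basic replica calculation from the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric). The helper names are hypothetical, and the calculation deliberately ignores the tolerance and stabilization behavior of the real controller.

```shell
# Hypothetical helper: attach an HPA to a Deployment.
# $1 = deployment, $2 = target CPU %, $3 = min replicas, $4 = max replicas.
autoscale_deployment() {
  kubectl autoscale deployment "$1" --cpu-percent="$2" --min="$3" --max="$4"
}

# Simplified version of the HPA scaling rule, using integer ceiling division.
# $1 = current replicas, $2 = current metric value, $3 = target metric value.
hpa_desired_replicas() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example with placeholder values:
# autoscale_deployment my-app 60 2 10
# hpa_desired_replicas 4 90 60
```

With 4 replicas running at 90% CPU against a 60% target, the simplified rule yields 6 replicas.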

Cluster Autoscaler

Finally, GKE helps manage the Cluster Autoscaler (CA), which scales the underlying infrastructure by provisioning and removing nodes for pods to run on, based on current demand. CA acts on pod scheduling and declared resource requests and, unlike HPA and VPA, doesn’t rely on load metrics.

CA allows you to:

  • Optimize your infrastructure costs by automatically selecting the least expensive node type that can fulfil demand
  • Automatically remove under-utilized nodes and provision nodes for pods that don’t fit in the existing cluster
  • Ensure your infrastructure grows to meet the demand for capacity as determined by VPA and HPA
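
As a rough sketch, cluster autoscaling can be switched on for an existing node pool with `gcloud container clusters update`. The helper name and all values passed to it are placeholders for illustration.

```shell
# Hypothetical helper: enable the cluster autoscaler on one node pool.
# $1 = cluster, $2 = zone, $3 = node pool, $4 = min nodes, $5 = max nodes.
enable_cluster_autoscaler() {
  gcloud container clusters update "$1" --zone="$2" --node-pool="$3" \
    --enable-autoscaling --min-nodes="$4" --max-nodes="$5"
}

# Example call with placeholder values:
# enable_cluster_autoscaler demo-cluster us-central1-a default-pool 1 5
```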

Choose the Right Machine Type

GKE offers other options, aside from autoscaling, that allow you to optimize the cost of running Kubernetes applications. E2 machine types (E2 VMs) allow you to save up to 55% when compared to N1 machine types. Preemptible VMs (PVMs), Google Cloud's counterpart to spot instances, can be up to 80% cheaper than standard Compute Engine VMs.

E2 VMs are cost-optimized for a range of workloads, including web servers, microservices, development environments, small-to-medium databases, and business-critical applications.

PVMs are Compute Engine VM instances that run for a maximum of 24 hours and can be shut down with only 30 seconds’ prior notice. Use PVMs to run batch or fault-tolerant jobs that can cope with nodes being stopped at short notice.

Don’t use PVMs for serving or stateful workloads unless you have prepared your architecture to compensate for their constraints. They don’t provide any availability guarantees, so you should use them with caution with GKE clusters.
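
One way to apply both ideas is to add a separate node pool of preemptible E2 machines for fault-tolerant work, as in the sketch below. The helper name, the pool and cluster names, and the `e2-standard-4` machine type are assumptions chosen for illustration.

```shell
# Hypothetical helper: add a node pool of preemptible E2 VMs.
# $1 = node pool name, $2 = cluster, $3 = zone, $4 = number of nodes.
create_preemptible_e2_pool() {
  gcloud container node-pools create "$1" --cluster="$2" --zone="$3" \
    --machine-type=e2-standard-4 --preemptible --num-nodes="$4"
}

# Example call with placeholder values:
# create_preemptible_e2_pool batch-pool demo-cluster us-central1-a 3
```

Keeping preemptible nodes in their own pool makes it easier to steer only batch or fault-tolerant workloads onto them.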

Enable GKE Usage Metering

GKE usage metering allows you to keep track of your usage and make sense of your overall GKE cluster costs. You can see which application or workload is using up the most resources, or if a particular component or environment has caused a sudden spike in usage.

You can also compare utilization with resource requests, and assess which workloads are inadequately provisioned. This allows you to identify sources of waste or insufficiencies.

GKE usage metering provides:

  • An overview with approximate cost breakdowns
  • Usage profiles for GKE clusters, broken down by labels and namespaces
  • Information about resource requests and consumption of cluster workloads, including CPU, GPU, memory, storage and network egress
  • Default Data Studio templates and customizable dashboards
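
Usage metering exports its data to BigQuery, so enabling it on an existing cluster might be sketched as follows. The helper name and the dataset name are placeholders, and the sketch assumes the BigQuery dataset already exists in the project.

```shell
# Hypothetical helper: enable GKE usage metering on an existing cluster.
# $1 = cluster, $2 = zone, $3 = existing BigQuery dataset (placeholder).
enable_usage_metering() {
  gcloud container clusters update "$1" --zone="$2" \
    --resource-usage-bigquery-dataset="$3"
}

# Example call with placeholder values:
# enable_usage_metering demo-cluster us-central1-a gke_usage
```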

Use Shielded Nodes

While GCP has high standards of security, running workloads on multi-tenant infrastructure still carries some risks. No operating system is completely safe from attacks, whether launched via a malicious process running directly on the operating system, or external manipulations originating from the network or infrastructure.

GKE clusters now support Shielded Nodes, which can mitigate risks associated with multi-tenant cloud environments. These nodes use specially hardened Shielded Google Compute Engine VMs to safeguard and monitor the runtime integrity of your nodes, starting with the boot process.

Shielded Nodes can be enabled at any time for a cluster. Note that once a cluster has been configured to use Shielded Nodes, standard nodes will no longer be able to join the cluster.
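
For an existing cluster, turning on Shielded Nodes could look roughly like this. The helper name, cluster name, and zone are placeholders for illustration.

```shell
# Hypothetical helper: enable Shielded Nodes on an existing cluster.
# $1 = cluster name, $2 = zone.
enable_shielded_nodes() {
  gcloud container clusters update "$1" --zone="$2" --enable-shielded-nodes
}

# Example call with placeholder values:
# enable_shielded_nodes demo-cluster us-central1-a
```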

Optimizing GKE Storage with NetApp Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP capacity can scale into the petabytes, and it supports various use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.

In particular, Cloud Volumes ONTAP supports Kubernetes Persistent Volume provisioning and management requirements of containerized workloads.

Learn more about how Cloud Volumes ONTAP helps to address the challenges of containerized applications in these Kubernetes Workloads with Cloud Volumes ONTAP Case Studies.

Yifat Perry, Technical Content Manager