Understanding Google Cloud High Availability

Written by Yifat Perry, Technical Content Manager | Dec 9, 2020 6:51:45 AM

How Does Google Cloud Offer High Availability?

The term “availability” describes how much of the time a service is reachable and how quickly it responds to user requests. High availability (HA) is a system design property intended to keep uptime at a consistently high level over prolonged periods.
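
To make "a consistent level of uptime" concrete, an availability target can be converted into a downtime budget. The sketch below is plain arithmetic; the targets it prints are illustrative examples, not Google Cloud SLA figures, and the function name is hypothetical.

```python
# Convert an availability target (in percent) into the downtime it permits.
# The targets used here are illustrative, not Google Cloud SLA values.

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

Each additional "nine" cuts the yearly downtime budget by a factor of ten, which is why higher targets usually require redundancy rather than simply more reliable single machines.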

At the time of writing, Google Cloud offers a robust and highly available architecture based on 24 regions and 73 availability zones. Google Compute Engine implements an abstraction layer between availability zones and the physical clusters of machines in Google data centers. Each cluster of physical machines has its own independent software, power, cooling, network, and security infrastructure, as well as its own computing and storage resources. A zone can be made up of one or more clusters.

Separating availability zones from clusters has several advantages:

  • Compute Engine can balance resources between clusters within a region.
  • Compute Engine continually adds clusters in each region while the availability zones stay the same, which provides stability for workloads running in those zones.

In this article we focus on high availability features Google provides for Compute Engine and the primary Google Cloud database service, Google Cloud SQL.

Google Compute Engine High Availability and Best Practices

To minimize failures, design your Compute Engine applications to be tolerant of errors, network failures, and unexpected disasters. A robust system should handle errors gracefully, for example by redirecting traffic from a failed instance to a healthy one, and should resume running automatically after a restart.

To design an error-tolerant system, create virtual machine instances in at least two availability zones, located in two different regions. That way, even if a zone or an entire region goes down, your application can continue working. If all of your instances are hosted in the same zone or region, your application will not be resilient to a failure of that part of Google's infrastructure.
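
The benefit of spreading instances across zones or regions can be estimated with a simple probability model. The sketch below assumes that each deployment fails independently of the others, which real infrastructure only approximates, and the 0.99 figure is an illustrative input rather than a published number.

```python
# Estimate the availability of several redundant deployments.
# Assumption: deployments fail independently of each other. Correlated
# failures (shared dependencies, bad rollouts) make the real figure lower.


def combined_availability(single: float, replicas: int) -> float:
    """Probability that at least one of `replicas` deployments is up."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only if every replica is down at the same time.
    return 1 - (1 - single) ** replicas


if __name__ == "__main__":
    for replicas in (1, 2, 3):
        value = combined_availability(0.99, replicas)
        print(f"{replicas} deployment(s): {value:.6f}")
```

Under this model a second independent deployment turns a 1% chance of unavailability into roughly 0.01%, which is the reasoning behind placing instances in more than one zone.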

Instance Groups

GCP provides managed instance groups, which are groups of virtual machine instances that serve a common purpose. With a managed instance group, traffic can be routed to multiple virtual machines via a load balancer, so if any individual instance fails, service is not disrupted.

Managed instance groups also provide the following features:

  • Autoscaling—automatically adjusts the number of VM instances in the group as load changes.
  • Autohealing—performs regular health checks, and if a VM instance is unhealthy, automatically recreates it.
  • Supports multiple zones—you can create a managed instance group across several availability zones in the same Google Cloud region.
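
The autohealing behavior described above can be pictured as a loop that checks each instance and replaces the ones that fail. The following is a toy model only: the `Instance` class, the `autoheal` function, and the instance and zone names are hypothetical and do not call the Compute Engine API.

```python
# Toy model of autohealing: unhealthy instances are replaced by new ones
# in the same zone. This illustrates the idea only; it is not the
# Compute Engine API, and all names here are hypothetical.
from dataclasses import dataclass
from itertools import count
from typing import Callable, Iterator, List


@dataclass
class Instance:
    name: str
    zone: str


def autoheal(instances: List[Instance],
             is_healthy: Callable[[Instance], bool],
             name_counter: Iterator[int]) -> List[Instance]:
    """Return the group with every unhealthy instance recreated."""
    healed = []
    for instance in instances:
        if is_healthy(instance):
            healed.append(instance)
        else:
            # Recreate in the same zone so the zonal spread is preserved.
            replacement = Instance(name=f"vm-{next(name_counter)}", zone=instance.zone)
            healed.append(replacement)
    return healed


if __name__ == "__main__":
    group = [Instance("vm-1", "zone-a"), Instance("vm-2", "zone-b")]
    healed = autoheal(group, lambda inst: inst.name != "vm-2", count(3))
    print([(inst.name, inst.zone) for inst in healed])
```

Keeping the replacement in the original zone is a design choice made for this sketch so that a multi-zone group stays evenly spread after healing.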

Load Balancing

GCP provides managed load balancing that helps you manage high loads of traffic, to avoid overwhelming VM instances. The load balancer service provides:

  • Forwarding rules—deploy applications in multiple regions using regional managed instance groups, and configure a forwarding rule for each region to distribute traffic to all virtual machines in that region. Each forwarding rule can use a single external IP address, through which users can access your application.
  • Global load balancing—deploy instances in multiple regions using global load balancing. HTTP/HTTPS load balancing lets you serve users from instances close to their geographical location. For example, if you have users in Europe, you can route them to VM instances of the application running in a European Google Cloud region.
  • Redundancy—achieve redundancy by load balancing between regions. If one region is not available, traffic is automatically routed to another region, while your service can still be accessed using the same external IP address.
  • Autoscaling—automatically add or remove instances when load increases or decreases, and load balance between instances in a managed instance group.
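
The routing and redundancy behavior in the list above amounts to "send each user to the nearest region that is healthy". Here is a minimal sketch of that decision, assuming latency is the measure of nearness; the function name, region labels, and latency values are hypothetical and this is not how the load balancer is configured in practice.

```python
# Sketch of a routing decision: prefer the lowest-latency region, and fall
# back to the next best one when a region is unavailable.
# Region labels and latency figures are made up for illustration.
from typing import Dict, Optional, Set


def pick_region(latency_ms: Dict[str, float], healthy: Set[str]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        return None  # No region can serve the request.
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    latencies = {"europe": 20.0, "us": 110.0, "asia": 240.0}
    print(pick_region(latencies, {"europe", "us", "asia"}))  # nearest region
    print(pick_region(latencies, {"us", "asia"}))            # europe is down
```

Because the caller always uses the same entry point, a region failure changes only which backend answers, not the address the user connects to.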

Startup and Shutdown Scripts

Google Compute Engine provides startup and shutdown scripts that are executed when an instance starts or stops. These scripts can automate tasks such as installing software, updating, backing up, and generating logs. Importantly, shutdown scripts run whenever an instance is shut down, even when the shutdown is unintentional.

These scripts are an effective way to create a bootstrap procedure for your instances and shut them down cleanly. Instead of using a custom image to configure the instance, you can use a startup script. After each restart, the startup script runs and can be used to install or update software, and ensure the appropriate services are running.

The shutdown script can perform important actions such as closing connections, saving the state of transactions, and backing up data.
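
As an illustration of "saving state" on shutdown and restoring it on startup, the sketch below writes application state to a JSON file atomically and reads it back. It assumes a Python interpreter is available on the instance and that the file lives on a disk that persists across restarts; the function names and the file path are hypothetical.

```python
# Sketch of state handling for shutdown and startup scripts.
# Assumptions: Python is installed on the instance, and the state file is
# on a disk that survives a restart. All names and paths are hypothetical.
import json
import os
import tempfile


def save_state(state: dict, path: str) -> None:
    """Write state as JSON atomically (temp file first, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # Make sure the data reaches the disk.
        os.replace(tmp_path, path)     # Atomic swap: readers never see a partial file.
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path: str, default: dict) -> dict:
    """Read saved state, or return a copy of `default` if none exists."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return dict(default)


if __name__ == "__main__":
    state_file = os.path.join(tempfile.gettempdir(), "example-app-state.json")
    save_state({"processed": 42}, state_file)  # what a shutdown script might do
    print(load_state(state_file, {"processed": 0}))  # what a startup script might do
```

Writing to a temporary file and renaming it keeps the previous state intact if the instance stops partway through the write, which matters because a shutdown script has only a limited time to finish.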

Google Cloud SQL High Availability

Google Cloud SQL is a managed database service that supports database engines including SQL Server, MySQL and PostgreSQL, and can connect to almost any application. It provides backup and replication capabilities for high availability. Below we cover some of the high availability features provided by Cloud SQL.

Cloud SQL Highly Available Instances

When creating a Cloud SQL instance in Google Cloud, you choose a location where the instance will live and where its data will be stored. Cloud SQL can be deployed as a zonal resource, which runs in a single availability zone.

However, you can also create a highly available Cloud SQL instance, which is defined as a regional resource and is deployed across two availability zones in the same region: a primary zone and a secondary zone.

It is highly recommended to choose the same location for the database and the related Compute Engine instances or App Engine applications (for example, applications accessing the database), to reduce latency and improve availability.
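
The sketch below builds a request body for a highly available instance that keeps the database in the same region as the application. The field names follow the Cloud SQL Admin API instance resource as we understand it (`availabilityType`, `locationPreference`) and should be checked against the current API reference. The helper function and the instance name, region, zones, tier, and database version are placeholder assumptions.

```python
# Sketch: build a Cloud SQL Admin API request body for a highly available
# (regional) instance. Field names are taken from the instance resource as
# we understand it; verify them against the current API reference.
# The helper and all values passed to it below are hypothetical.
import json


def ha_instance_body(name: str, region: str, primary_zone: str,
                     secondary_zone: str, tier: str, database_version: str) -> dict:
    """Return a request body asking for a regional (HA) instance."""
    if primary_zone == secondary_zone:
        raise ValueError("primary and secondary zones must differ")
    for zone in (primary_zone, secondary_zone):
        # Zone names are formed as "<region>-<letter>", e.g. europe-west1-b.
        if not zone.startswith(region + "-"):
            raise ValueError(f"zone {zone} is not in region {region}")
    return {
        "name": name,
        "region": region,
        "databaseVersion": database_version,
        "settings": {
            "tier": tier,
            "availabilityType": "REGIONAL",
            "locationPreference": {
                "zone": primary_zone,
                "secondaryZone": secondary_zone,
            },
        },
    }


if __name__ == "__main__":
    body = ha_instance_body("example-db", "europe-west1", "europe-west1-b",
                            "europe-west1-c", "db-custom-2-7680", "POSTGRES_13")
    print(json.dumps(body, indent=2))
```

Validating that both zones belong to the chosen region before sending the request catches a common mistake early, since the two zones of a highly available instance sit inside a single region.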

Cloud SQL Regional and Multiregional Locations

You can create a highly available Cloud SQL instance in two types of locations:

  • Regional location—a specific geographical location, such as London.
  • Multiregional location—an extended geographic area that includes two or more geographic locations, such as the United States.

Regional and multiregional locations differ only in how backups are stored. A multiregional instance can save backups in multiple regions for higher resiliency.

Here are the regions and how they are grouped into multiregional locations:

  • North America multiregional location (us)—includes Los Angeles, Salt Lake City, Oregon, Las Vegas, N. Virginia, S. Carolina, Iowa, and Montreal.
  • Europe multiregional location (eu)—includes London, Frankfurt, the Netherlands, Zürich, Belgium, and Finland.
  • Asia multiregional location (asia)—includes Tokyo, Osaka, Seoul, Taiwan, Hong Kong, Mumbai, Singapore, and Jakarta.
  • Non-multiregional locations—São Paulo and Sydney.

Cloud SQL Use of Availability Zones

Regions are divided into availability zones. Each availability zone is completely independent from other availability zones in the region. Cloud SQL lets you select the zone for your Cloud SQL instance, to ensure it is as close as possible to the applications that need to access it. 

For zonal Cloud SQL instances, you select the single zone the instance runs in. For highly available regional instances, you select the primary zone, and you can either select the secondary zone yourself or let Cloud SQL choose it automatically.

Google Cloud High Availability with NetApp Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure, and Google Cloud. Cloud Volumes ONTAP supports a capacity of up to 368TB and a variety of use cases, such as file services, databases, DevOps, or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.

In particular, Cloud Volumes ONTAP provides high availability, ensuring business continuity with no data loss (RPO=0) and minimal recovery times (RTO < 60 secs).