The term “availability” describes how long a service remains available and how quickly it responds to user requests. High availability (HA) is a system characteristic designed to provide a consistent level of uptime over prolonged periods.
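To make the idea of uptime concrete, here is a minimal sketch of the arithmetic behind an availability target. It assumes the common convention of expressing availability as the fraction of a period during which the service is up. The function names are illustrative and do not come from any Google Cloud API.

```python
def availability(uptime_hours: float, total_hours: float) -> float:
    """Fraction of the period during which the service was up."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return uptime_hours / total_hours


def allowed_downtime_minutes(target: float, period_days: float = 365) -> float:
    """Downtime budget, in minutes, implied by an availability target."""
    if not 0 <= target <= 1:
        raise ValueError("target must be between 0 and 1")
    return (1 - target) * period_days * 24 * 60


# A 99.9% target over a 365-day year leaves about 525.6 minutes of downtime.
print(round(allowed_downtime_minutes(0.999), 1))
```

Each additional "nine" in the target cuts the downtime budget by a factor of ten, which is why higher targets usually require redundancy rather than just careful operation.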
Google Cloud offers a robust and highly available architecture based on 24 regions and 73 availability zones. Google Compute Engine implements an abstraction layer between availability zones and physical clusters of machines in Google data centers. Each cluster of physical machines has its own independent software, power, cooling, network and security infrastructure, as well as its own computing and storage resources. A zone can be made up of one or more clusters.
Separating availability zones from clusters has several advantages:
In this article we focus on high availability features Google provides for Compute Engine and the primary Google Cloud database service, Google Cloud SQL.
To minimize the impact of failures, design your Compute Engine applications to tolerate errors, network failures, and unexpected disasters. A robust system should handle errors gracefully, for example by redirecting traffic from a failed instance to an active one, and should resume running automatically after a restart.
To design an error-tolerant system, create virtual machine instances across at least two availability zones located in two regions. This ensures that even if a zone or an entire region goes down, your application can continue working. If all of your instances are hosted in the same zone or region, your application will not be resilient to failure of Google infrastructure.
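The placement rule above can be checked mechanically. The following is a rough sketch, not a Google Cloud API call: the `Instance` record and the `survives_single_failure` helper are hypothetical, and the region and zone names are used only as examples.

```python
from collections import namedtuple

# Hypothetical record type; the region and zone names below are examples only.
Instance = namedtuple("Instance", ["name", "region", "zone"])


def survives_single_failure(instances, level):
    """Return True if at least one instance remains after any one
    zone (level="zone") or region (level="region") goes down."""
    if level not in ("zone", "region"):
        raise ValueError("level must be 'zone' or 'region'")
    locations = {getattr(i, level) for i in instances}
    # With instances in two or more distinct locations, losing one
    # location always leaves at least one instance running.
    return len(locations) >= 2


two_regions = [
    Instance("web-1", "us-central1", "us-central1-a"),
    Instance("web-2", "us-east1", "us-east1-b"),
]
one_region = [
    Instance("web-1", "us-central1", "us-central1-a"),
    Instance("web-2", "us-central1", "us-central1-b"),
]

print(survives_single_failure(two_regions, "zone"),
      survives_single_failure(two_regions, "region"))  # True True
print(survives_single_failure(one_region, "zone"),
      survives_single_failure(one_region, "region"))   # True False
```

The second fleet survives the loss of a zone but not of its region, which is the gap the two-region recommendation is meant to close.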
GCP provides managed instance groups, which are groups of virtual machine instances that serve a common purpose. With a managed instance group, traffic can be routed to multiple virtual machines via a load balancer, so if any individual instance fails, service is not disrupted.
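To illustrate why routing across several instances keeps a service running when one of them fails, here is a toy simulation of round-robin routing that skips unhealthy backends. It is a simplified model, not how Google Cloud load balancing is implemented, and the `RoundRobinBalancer` class and the `vm-*` names are invented for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: hands out backends in turn, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        """Record that a backend failed its health check."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is healthy again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none are left."""
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["vm-a", "vm-b", "vm-c"])
lb.mark_down("vm-b")
print([lb.pick() for _ in range(4)])  # ['vm-a', 'vm-c', 'vm-a', 'vm-c']
```

Requests keep flowing to the remaining instances while `vm-b` is down. Only when every backend is unhealthy does the balancer have nowhere to send traffic.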
Managed instance groups also provide the following features:
GCP provides managed load balancing that helps you manage high loads of traffic, to avoid overwhelming VM instances. The load balancer service provides:
Google Compute Engine provides startup and shutdown scripts that are executed when an instance starts or stops. These scripts can automate tasks such as installing software, applying updates, backing up data, and generating logs. Importantly, shutdown scripts run whenever an instance is shut down, even if the shutdown is unintentional.
These scripts are an effective way to create a bootstrap procedure for your instances and shut them down cleanly. Instead of using a custom image to configure the instance, you can use a startup script. After each restart, the startup script runs and can be used to install or update software, and ensure the appropriate services are running.
The shutdown script can perform important actions such as closing connections, saving the state of transactions, and backing up data.
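As an illustration of the kind of work a shutdown script might perform, the sketch below saves application state to disk and runs a list of cleanup tasks in order. It is a generic example under stated assumptions: `save_state` and `run_shutdown_tasks` are hypothetical helpers, not part of Compute Engine, and the tasks you register would depend on your application.

```python
import json
import os
import tempfile


def save_state(state: dict, path: str) -> None:
    """Write state to path atomically, so an interrupted write
    cannot leave a partial file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    # os.replace swaps the finished file into place in one step.
    os.replace(tmp_path, path)


def run_shutdown_tasks(tasks):
    """Run each (name, callable) cleanup task in order.

    A failing task is reported and skipped so the remaining tasks
    still run. Returns the names of the tasks that failed.
    """
    failed = []
    for name, task in tasks:
        try:
            task()
        except Exception as exc:
            print(f"shutdown task {name!r} failed: {exc}")
            failed.append(name)
    return failed
```

A script built on these helpers might register tasks such as `("save-state", lambda: save_state(current_state, "/var/lib/myapp/state.json"))`, where both `current_state` and the path are placeholders. Isolating each task means a failed backup does not prevent connections from being closed.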
Google Cloud SQL is a managed database service that supports database engines including SQL Server, MySQL and PostgreSQL, and can connect to almost any application. It provides backup and replication capabilities for high availability. Below we cover some of the high availability features provided by Cloud SQL.
When creating a Cloud SQL instance in Google Cloud, you can choose the location where the instance runs and its data is stored. Cloud SQL can be deployed as a zonal resource, which is stored in two availability zones, a primary zone and a secondary zone.
However, you can also create a highly available Cloud SQL instance, which is defined as a regional resource and is deployed across multiple zones in one or more regions.
It is highly recommended to choose the same location for the database and the related Compute Engine instances or App Engine applications (for example, applications accessing the database), to reduce latency and improve availability.
You can create a highly available Cloud SQL instance in two types of locations:
Regional and multiregional locations differ only in how backups are stored. A multiregional instance can save backups in multiple regions for higher resiliency.
Here are the regions and how they are grouped into multi-regional locations:
Regions are divided into availability zones. Each availability zone is completely independent from other availability zones in the region. Cloud SQL lets you select the zone for your Cloud SQL instance, to ensure it is as close as possible to the applications that need to access it.
For zonal Cloud SQL instances, Cloud SQL lets you select a primary and secondary zone. For regional or multiregional resources, you can only select the primary zone, and other zones are selected automatically.
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports a capacity of up to 368 TB and a range of use cases such as file services, databases, DevOps, or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
In particular, Cloud Volumes ONTAP provides high availability, ensuring business continuity with no data loss (RPO=0) and minimal recovery times (RTO < 60 secs).