Google Cloud Platform (GCP) provides a wide range of computing resources, including database services. GCP offers three types of reference architectures for global data distribution—hybrid, multicloud, and regional distribution. When choosing a Google database service, you should take these architectures into consideration.
In this post, we’ll explain data distribution in GCP, and provide an overview of popular Google cloud database services, including key considerations when assessing and choosing a service. We’ll also show how NetApp Cloud Volumes ONTAP can help centralize and simplify the management of Google cloud database resources.
This is part of an extensive series of guides about managed services.
Google Cloud Platform (GCP) supports three primary deployment models: single cloud, hybrid, and multicloud.
The simplest deployment model is to deploy databases on Google Cloud only, via:
Hybrid deployments are useful when you have applications in the cloud that need to access on-premises databases or vice versa. For example, you might perform marketing analytics on-premises and need to access customer databases hosted in the cloud.
There are three primary considerations for deploying a database in a hybrid model, with some data on Google Cloud and some on-premises:
The following diagram illustrates an example of a hybrid architecture with Google Cloud and on-premises systems.
Multicloud deployments enable you to combine databases deployed on Google Cloud with database services from other cloud providers. This can help you create multiple fail-safes, more effectively distribute your database, or take advantage of a wider array of proprietary cloud features.
When considering a multicloud deployment you should be aware of the following:
The following diagram illustrates a multicloud deployment involving GCP and another public cloud provider.
GCP offers several Google Cloud database services you can choose from. Below is an introduction to each.
Cloud SQL is a fully managed, relational Google Cloud database service that is compatible with SQL Server, MySQL, and PostgreSQL. It includes features for automated backups, data replication, and disaster recovery to ensure high availability and resilience. You can integrate this service with Compute Engine, App Engine, BigQuery, and Kubernetes.
Common use cases for Cloud SQL include:
Cloud Spanner is another fully managed, relational Google Cloud database service. It differs from Cloud SQL in that it combines the benefits of relational structure with non-relational scalability. It provides strong consistency across rows and high-performance operations. It includes features for automatic replication, built-in security, and multi-language support.
Use cases for Cloud Spanner include:
BigQuery is a fully managed, serverless data warehouse. You can use it to perform data analyses via SQL and query streaming data. This service includes a built-in Data Transfer Service to help you migrate data from on-premises resources, including Teradata.
BigQuery includes features for machine learning, business intelligence, and geospatial analysis. These features are provided through BigQuery ML, BI Engine, and GIS.
Use cases for BigQuery include:
Cloud Bigtable is a fully managed NoSQL Google Cloud database service. It is designed for large operational and analytics workloads. Cloud Bigtable includes features for high availability, zero-downtime configuration changes, and sub-10ms latency. You can integrate it with a variety of tools, including Apache Hadoop, TensorFlow, and Google Cloud services like BigQuery.
Use cases for Cloud Bigtable include:
Cloud Firestore is a fully managed, serverless NoSQL Google Cloud database designed for the development of serverless apps. You can use it to store, sync, and query data for web, mobile, and IoT applications. It includes features for offline support, live synchronization, and built-in security. You can integrate Firestore with Firebase, GCP’s mobile development platform, for easier app creation and management.
Use cases for Cloud Firestore include:
Realtime Database is a NoSQL Google Cloud database that is part of the Firebase platform. It enables you to store and sync data in real time and includes caching capabilities for offline use. Realtime Database also enables you to implement declarative authentication, matching users by identity or by pattern. It includes mobile and web software development kits (SDKs) for easier and faster app development.
Use cases for Firebase Realtime Database include:
Cloud Memorystore is a fully managed, in-memory Google Cloud data store. It is designed to be secure, highly available, and scalable. Cloud Memorystore enables you to create application caches with sub-millisecond latency for data access. It is compatible with Memcached and Redis protocols.
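To make the application cache idea concrete, here is a minimal sketch of the common cache-aside pattern. It is not Memorystore client code: `InMemoryCache` is a hypothetical dict-backed stand-in for a Redis- or Memcached-compatible cache, and `get_with_cache` and `load_from_db` are illustrative names. In a real deployment you would swap the stand-in for a client that talks to your Memorystore instance.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Hypothetical stand-in for a Redis/Memcached-style cache with TTLs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_with_cache(
    cache: InMemoryCache,
    key: str,
    load_from_db: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Cache-aside read: try the cache first, fall back to the database."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(key)  # slow path: hit the backing database
    cache.set(key, value, ttl_seconds)
    return value
```

The time-to-live bounds how stale a cached value can get, which is one simple way to limit the cache-to-database consistency problems that come with any cache.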
Use cases for Cloud Memorystore include:
Even after you explore your database options in Google Cloud, deciding which are the right options for you can be a challenge. When considering your options, keep in mind that many organizations need and can benefit from using multiple services. This enables you to optimize your implementations according to database capabilities, rather than trying to adapt a database service to fit all needs.
Cloud SQL
Cloud SQL is a good option when you need relational database capabilities but don’t need storage capacity over 10TB or more than 4000 concurrent connections. You also need to be skilled at on-premises management.
Cloud Spanner
Cloud Spanner is a good option when you plan to use large amounts of data (more than 10TB) and need transactional consistency. It is also good if you want to use sharding for higher throughput and accessibility.
If you know or think that you might eventually need to horizontally scale your Google Cloud database, Cloud Spanner is a better option than Cloud SQL. If you start with Cloud SQL and eventually need to move to Cloud Spanner, be prepared to rewrite your application in addition to migrating your database.
Cloud Firestore/Datastore
Cloud Firestore and Datastore are good options when you plan to focus on app development and need live synchronization and offline support.
If you need to store unstructured data in JSON documents, Cloud Datastore is the recommended option. If you need to store structured data, Cloud Spanner is recommended instead.
An additional factor to consider is whether you need atomicity, consistency, isolation, durability (ACID) compliance. If so, you need to choose Cloud Spanner since Cloud Datastore only offers atomic and durable transactions.
Cloud Bigtable
Cloud Bigtable is a good option if you are using large amounts of single key data. In particular, it is good for low-latency, high throughput workloads.
If you need to perform single-region analytics, Cloud Bigtable is preferred over Cloud Spanner. However, if you need multi-regional operations, Cloud Spanner is the recommended solution. For example, Cloud Bigtable is a good option for a time series app created for DevOps monitoring. Meanwhile, Cloud Spanner is the recommended option for an infrastructure monitoring platform designed for a software as a service (SaaS) offering.
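As an illustration of what "single key data" can look like for a time series monitoring app, here is a small sketch of a row-key layout. Bigtable keeps rows sorted lexicographically by row key, so the sketch mimics that with a sorted Python list rather than calling a Bigtable client. The `metric#host#timestamp` layout and the helper names `make_row_key` and `prefix_scan` are assumptions chosen for illustration, not a schema prescribed by Google.

```python
from bisect import bisect_left
from typing import List, Tuple

Row = Tuple[str, float]  # (row key, metric value)


def make_row_key(metric: str, host: str, epoch_seconds: int) -> str:
    """Build a key whose string order matches chronological order.

    The timestamp is zero-padded so that lexicographic comparison of the
    keys agrees with numeric comparison of the timestamps.
    """
    return f"{metric}#{host}#{epoch_seconds:010d}"


def prefix_scan(sorted_rows: List[Row], prefix: str) -> List[Row]:
    """Return all rows whose key starts with `prefix`.

    `sorted_rows` must be sorted by key, standing in for a table that is
    ordered by row key.
    """
    keys = [key for key, _ in sorted_rows]
    start = bisect_left(keys, prefix)  # first key >= prefix
    matches: List[Row] = []
    for key, value in sorted_rows[start:]:
        if not key.startswith(prefix):
            break  # rows are sorted, so no later key can match
        matches.append((key, value))
    return matches
```

With this layout, all samples for one metric on one host sit next to each other in time order, so reading a host's history is a single contiguous scan.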
Cloud Memorystore
Cloud Memorystore is a good option if you are using key-value datasets and your primary concern is transaction latency.
If you do not need disk-based data persistence and are only using the service for caching, Cloud Memorystore should be your choice. However, if you are concerned about issues like cache to database consistency or stream processing, you should choose Cloud Bigtable. Likewise, any time that your volume of data is too big to fit into memory, Cloud Memorystore is not the best option for you.
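The rules of thumb in this section can be summarized as a small decision helper. This is only a sketch of the guidance above, not an official selection tool: the `Workload` fields and the `recommend_database` function are hypothetical names, and the only thresholds used are the 10TB and 4000-connection figures quoted for Cloud SQL.

```python
from dataclasses import dataclass

# Thresholds quoted in the guidance above for Cloud SQL.
CLOUD_SQL_MAX_STORAGE_TB = 10
CLOUD_SQL_MAX_CONNECTIONS = 4000


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    relational: bool = False
    storage_tb: float = 0.0
    concurrent_connections: int = 0
    needs_horizontal_scaling: bool = False
    multi_region: bool = False
    needs_acid: bool = False
    document_data: bool = False
    live_sync_or_offline: bool = False
    single_key_low_latency: bool = False
    cache_only: bool = False
    fits_in_memory: bool = True


def recommend_database(w: Workload) -> str:
    """Map a workload to a service using this article's rules of thumb."""
    # Pure caching with data that fits in memory.
    if w.cache_only and w.fits_in_memory:
        return "Cloud Memorystore"
    # Relational data: Cloud SQL until its limits or scaling needs are hit.
    if w.relational:
        outgrows_cloud_sql = (
            w.storage_tb > CLOUD_SQL_MAX_STORAGE_TB
            or w.concurrent_connections > CLOUD_SQL_MAX_CONNECTIONS
            or w.needs_horizontal_scaling
        )
        return "Cloud Spanner" if outgrows_cloud_sql else "Cloud SQL"
    # App development with documents, live sync, or offline support.
    if w.document_data or w.live_sync_or_offline:
        return "Cloud Spanner" if w.needs_acid else "Cloud Firestore/Datastore"
    # Large single-key, low-latency workloads.
    if w.single_key_low_latency:
        return "Cloud Spanner" if w.multi_region else "Cloud Bigtable"
    return "No single match; consider combining services"
```

For example, `recommend_database(Workload(relational=True, storage_tb=50))` returns `"Cloud Spanner"`, because the workload exceeds the 10TB figure given for Cloud SQL. Real decisions will involve more factors, and many organizations end up combining several services.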
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to 368TB of capacity and a range of use cases such as file services, databases, DevOps, or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
In particular, Cloud Volumes ONTAP helps address database workload challenges in the cloud and fills the gap between your cloud-based database capabilities and the public cloud resources they run on.
Cloud Firestore enables you to store web and mobile application data in Google Cloud Platform (GCP). You can leverage Cloud Firestore for real-time synchronization between client applications by using listeners.
This article explains what Cloud Firestore is and how it works, notes the differences between Cloud Firestore and Realtime Database, and covers best practices for Cloud Firestore implementations.
Read more: Cloud Firestore: An In-Depth Look.
Google Cloud Analytics services provide various capabilities you can use to leverage data to improve customer experience and democratize the use of data across various collaborators. Learn how to build efficient architectures while using Google services.
Read more: 8 Types of Google Cloud Analytics: How to Choose?
High availability provides a consistent level of uptime, ensuring workloads experience minimal failure. In GCP, this is achieved by leveraging 24 regions and 73 availability zones, along with Compute Engine.
Read more: Understanding Google Cloud High Availability.
There are several ways to run MySQL on Google Cloud. You can use Google Cloud SQL, which is a managed Google Cloud service. Alternatively, you can use a Google Cloud Marketplace image to install MySQL on a Compute Engine instance. It is also possible to manually install MySQL on Compute Engine.
This article provides an in-depth look at these three deployment options.
Read more: Google Cloud MySQL: The Complete Guide
Google Cloud PostgreSQL is a fully managed Google Cloud database service, which allows you to automatically provision and manage PostgreSQL database instances. Learn about the Google Cloud PostgreSQL managed service, and the pros and cons of managed vs. self-managed PostgreSQL on Google Cloud.
Read more: Google Cloud PostgreSQL: Managed or Self Managed?
The Google Cloud Platform provides multiple services that support big data storage and analysis. Possibly the most important is BigQuery, a high performance SQL-compatible engine that can perform analysis on very large data volumes in seconds.
Learn how Google Cloud Big Data services can help you build a robust big data infrastructure.
Read more: Google Cloud Big Data: Building Your Big Data Architecture on GCP.
Google’s cloud platform (GCP) offers a wide variety of database services. Of these, its NoSQL database services are unique in their ability to rapidly process very large, dynamic datasets with no fixed schema. Learn about the big three Google Cloud NoSQL offerings, providing high performance data access for web applications, mobile applications, and huge scale datasets.
Read more: Google Cloud NoSQL: Firestore, Datastore, and Bigtable.
A data lake is a central repository designed to store, process, and protect large volumes of structured, semi-structured and unstructured data. You can store the data in its native format and use a variety of data without considering size limitations. Learn about the four phases in a Google Cloud data lake lifecycle, and the tools and services Google provides for implementing them.
Read more: Google Cloud Data Lake: 4 Phases of the Data Lake Lifecycle.
Google Cloud SQL is a managed database service that allows you to run Microsoft SQL Server, MySQL, and PostgreSQL on Google Cloud. The service provides replication, automated backups, and failover to ensure high-availability and resilience. In addition, it provides an easy and fast way to deploy and operate an SQL database in your cloud.
This post introduces the Google Cloud SQL service, explains the features that Google provides for each type of database, the costs, and how to start your first database.
Read more: Google Cloud SQL: MySQL, Postgres and MS SQL on Google Cloud.
Google Cloud SQL is a database service that offers managed versions of SQL Server, MySQL, and PostgreSQL. This service can provide significant benefits over on-premises implementations. However, before signing up, you should consider both pricing and its limitations.
This article explains the various pricing breakdowns of SQL database services in Google Cloud, covers the limitations of Google Cloud SQL, and highlights how you can optimize costs with Cloud Volumes ONTAP.
Read more: Google Cloud SQL Pricing, and Limits: A Cheatsheet for Cost Optimization
Google Cloud Datastore is a highly scalable, managed NoSQL database hosted on the Google Cloud Platform.
Google has released Firestore, a new version of Datastore with several improvements and additional features. In the future, existing Datastore databases will be automatically upgraded to Firestore.
Read more: Should You Still Be Using Google Cloud Datastore?
Google Cloud Dataflow is a managed service used to execute data processing pipelines based on Apache Beam via the Google Cloud Platform (GCP).
Dataflow is a fully managed pipeline runner that does not require initial setup of underlying resources. Because it is fully integrated with the Google Cloud Platform (GCP), it can easily be combined with other Google Cloud big data services, such as Google BigQuery.
Read more: Google Cloud Dataflow: The Basics and 4 Critical Best Practices