Google Databases - Study Notes - Data, Information and Analytics

Overview

In the study note series, this post covers the broad Google Database options available. All details are accurate at the time of writing, please refer to Google for current details.

There are many existing GCP Database comparisons already out there, however I haven’t found any that include details like ACID compliance, SQL vs NoSQL, ingestion methods, etc. I have therefore put together a blog that:

Differentiates between them
Describes how to load data into them
And in the case of the NoSQL, include examples of how to query them

Comparison

The following are the key features that will determine when to use a given option.

Option	SQL	Category	Indexing	Workload	Storage Limits	ACID[i]
Cloud SQL	Yes	RDBMS	Yes	OLTP	30,720 GB	Yes
Cloud Spanner	Yes	RDBMS	Yes	OLTP[ii]	2TB per Node	Yes
Cloud Datastore	No	Document	Yes	Document	Entities <= 1 Mb	Partial
Cloud Bigtable	No	key/value	Yes	HTAP[iii]	Petabytes	No
BigQuery	Yes[iv]	RDBMS	No	OLAP[v]	Petabytes	Yes
Cloud Firestore	No	Document	Yes	Document	Document <= 1 MiB	Yes

Database Comparisons

Ingestion

Ok, assuming you have chosen a Database option, the next question is how to ingest data.

Tool	Options
Cloud SQL	GCP Console from a CSV file Google Cloud Storage (GCS)Cloud Dataflow from any compatible source
Cloud Spanner	GCP Console from an AVRO file GCSCloud Dataflow from any compatible source
Cloud Datastore	Managed Export and Import service via Command line from GCSFrom BigQuery via consoleCloud Dataflow from any compatible source
Cloud Bigtable	Cloud Dataflow from any compatible source
BigQuery	GCP Console, command line, API, or client library from Avro, CSV, JSON, ORC or Parquet files in GCSGCP Console from Cloud Datastore exports in GCSGCP Console from Cloud Firestore exports in GCSCloud Dataflow from any compatible source
Cloud Firestore	GCP Console from Cloud Firestore exports in GCSCloud Dataflow from any compatible source

Ingestion Options

NoSQL Querying

The final question is how do we extract the data we have ingested. No doubt we are all familiar with SQL but less so with NoSQL, the following are example Python queries

Option	Examples
Cloud Datastore	query = client.query(kind=’Task’) query.add_filter(‘done’, ‘=’, False) query.add_filter(‘priority’, ‘>=’, 4) query.order = [‘-priority’] query = client.query(kind=’Task’) query.add_filter(‘done’, ‘=’, False) From https://cloud.google.com/datastore/docs/concepts/queries
Cloud Bigtable	table = instance.table(table_id) row_key = ‘r1’row = table.read_row(row_key.encode(‘utf-8’))column_family_id = ‘cf1’column_id = ‘c1’.encode(‘utf-8’)value = row.cells[column_family_id][column_id][0].value.decode(‘utf-8’)print(‘Row key: {}\nData: {}’.format(row_key, value))From https://cloud.google.com/bigtable/docs/reference/libraries
Cloud Firestore	var citiesRef = db.collection(“cities”); var query = citiesRef.where(“state”, “==”, “CA”);From https://firebase.google.com/docs/firestore/query-data/queries

Code Examples

[i] ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions[ii] Online Transactional Processing[iii] Hybrid Transactional/Analytical Processing[iv] Two versions, Legacy and Standard[v] Online Analytical Processing

Overview

Comparison

Ingestion

NoSQL Querying

Leave a Reply