Google Databases – Study Notes

Overview

In the study note series, this post covers the broad Google Database options available. All details are accurate at the time of writing, please refer to Google for current details.

There are many existing GCP Database comparisons already out there, however I haven’t found any that include details like ACID compliance, SQL vs NoSQL, ingestion methods, etc. I have therefore put together a blog that:

  • Differentiates between them
  • Describes how to load data into them
  • And in the case of the NoSQL, include examples of how to query them

Comparison

The following are the key features that will determine when to use a given option.

OptionSQLCategoryIndexingWorkloadStorage LimitsACID[i]
Cloud SQLYesRDBMSYesOLTP30,720 GBYes
Cloud SpannerYesRDBMSYesOLTP[ii]2TB per NodeYes
Cloud DatastoreNoDocumentYesDocumentEntities <= 1 MbPartial
Cloud BigtableNokey/valueYesHTAP[iii]PetabytesNo
BigQueryYes[iv]RDBMSNoOLAP[v]PetabytesYes
Cloud FirestoreNoDocumentYesDocumentDocument <= 1 MiBYes
Database Comparisons

Ingestion

Ok, assuming you have chosen a Database option, the next question is how to ingest data.

ToolOptions
Cloud SQLGCP Console from a CSV file Google Cloud Storage (GCS)Cloud Dataflow from any compatible source
Cloud SpannerGCP Console from an AVRO file GCSCloud Dataflow from any compatible source
Cloud DatastoreManaged Export and Import service via Command line from GCSFrom BigQuery via consoleCloud Dataflow from any compatible source
Cloud BigtableCloud Dataflow from any compatible source
BigQueryGCP Console, command line, API, or client library from Avro, CSV, JSON, ORC or Parquet files in GCSGCP Console from Cloud Datastore exports in GCSGCP Console from Cloud Firestore exports in GCSCloud Dataflow from any compatible source
Cloud FirestoreGCP Console from Cloud Firestore exports in GCSCloud Dataflow from any compatible source
Ingestion Options

NoSQL Querying

The final question is how do we extract the data we have ingested. No doubt we are all familiar with SQL but less so with NoSQL, the following are example Python queries

OptionExamples
Cloud Datastorequery = client.query(kind=’Task’) query.add_filter(‘done’, ‘=’, False) query.add_filter(‘priority’, ‘>=’, 4) query.order = [‘-priority’] query = client.query(kind=’Task’) query.add_filter(‘done’, ‘=’, False) From  https://cloud.google.com/datastore/docs/concepts/queries
Cloud Bigtabletable = instance.table(table_id)
row_key = ‘r1’row = table.read_row(row_key.encode(‘utf-8’))column_family_id = ‘cf1’column_id = ‘c1’.encode(‘utf-8’)value = row.cells[column_family_id][column_id][0].value.decode(‘utf-8’)print(‘Row key: {}\nData: {}’.format(row_key, value))From  https://cloud.google.com/bigtable/docs/reference/libraries
Cloud Firestorevar citiesRef = db.collection(“cities”);
var query = citiesRef.where(“state”, “==”, “CA”);From  https://firebase.google.com/docs/firestore/query-data/queries
Code Examples

[i] ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions[ii] Online Transactional Processing[iii] Hybrid Transactional/Analytical Processing[iv] Two versions, Legacy and Standard[v] Online Analytical Processing

Leave a Reply

Your email address will not be published. Required fields are marked *