Google BigQuery

Guide to integrating BigQuery with Sprinkle

This page covers the details about integrating BigQuery with Sprinkle.

When setting up a BigQuery connection, Sprinkle additionally requires a Cloud bucket. This guide covers the role of each component and the steps to set them up.

Step by Step Guide

Integrating BigQuery

STEP-1: Create a Service Account

Create a service account that Sprinkle will use to connect to BigQuery.

  • Create a Service Account and give it any name, such as “sprinkle”.

  • Grant the service account the BigQuery Admin role.

  • Create a JSON key for this service account, and download it
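Before pasting the downloaded key into Sprinkle, it can help to confirm the file is intact. The following is a minimal sketch, not part of Sprinkle: the function name `check_service_account_key` is hypothetical, and the field list assumes the standard layout of a Google service account key file.

```python
import json

# Fields found in a standard Google service account key file.
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
)


def check_service_account_key(key_text):
    """Return a list of problems found in the key JSON text (empty if none)."""
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = [
        f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)
    ]
    # A service account key identifies itself with type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

To use it, read the downloaded file as text and pass the contents to `check_service_account_key`. An empty list means the key has the expected shape; it does not prove that the key is still active or that the role has been granted.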

STEP-2: Create a BigQuery dataset

Create a BigQuery dataset and give it any name, such as “sprinkle_dataset” (BigQuery dataset names may contain only letters, numbers, and underscores). Sprinkle will create all tables within this dataset.
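A candidate dataset name can be checked before it is created. This is a minimal sketch; the helper name `is_valid_dataset_id` is hypothetical, and the rule it encodes (letters, digits, and underscores only, up to 1,024 characters) is an assumption based on BigQuery's documented naming constraints for datasets.

```python
import re

# Assumed rule: letters, digits, and underscores only, 1 to 1024 characters.
_DATASET_ID = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_dataset_id(name):
    """Return True if the name matches the assumed dataset naming rule."""
    return _DATASET_ID.fullmatch(name) is not None
```

Under this rule a name such as `sprinkle_dataset` passes, while names containing spaces or hyphens are rejected.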

STEP-3: Configure BigQuery Connection

  • Log in to the Sprinkle application

  • Navigate to Admin -> Warehouse -> New Warehouse Connection

  • Select BigQuery

  • Provide all the mandatory details

    • Distinct Name: Name to identify this connection

    • Project Id: Enter the GCP project ID where your BigQuery instance is created.

    • Private JSON key: Copy and paste the contents of the JSON key file downloaded during service account creation. (STEP-1)

    • Dataset: Specify the name of the BigQuery dataset you want to use (created in STEP-2 above). Datasets are top-level containers that organize and control access to your tables within BigQuery.

    • Advanced Settings (Optional):

      • Maximum Error Count: This optional setting defines a threshold for errors encountered during data load operations. If the number of errors returned by the load process exceeds the specified Maximum Error Count, the load fails. If the error count stays within the threshold, the load continues and returns an informational message detailing the number of rows that failed to load due to formatting errors or other data inconsistencies.

  • Test Connection

  • Create
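The Maximum Error Count behaviour described in the advanced settings can be illustrated with a small sketch. This is not Sprinkle's implementation: the function name `evaluate_load` and the message wording are hypothetical, and it assumes that a count exactly equal to the threshold does not fail the load, since the text only says the load fails when the threshold is exceeded.

```python
def evaluate_load(error_count, max_error_count):
    """Return (succeeded, message) for a load that produced error_count bad rows."""
    if error_count > max_error_count:
        # Threshold exceeded: the whole load is treated as failed.
        return False, (
            f"Load failed: {error_count} errors exceed the maximum of {max_error_count}."
        )
    if error_count == 0:
        return True, "Load completed with no errors."
    # Within the threshold: the load continues and reports the skipped rows.
    return True, f"Load completed; {error_count} rows failed to load."
```

For example, with a threshold of 10, a load with 3 bad rows succeeds and reports the 3 skipped rows, while a load with 11 bad rows fails.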

Create Cloud Bucket

Sprinkle requires a Cloud Bucket to store intermediate data and report caches. Follow the steps below to create and configure the cloud bucket:

STEP-1: Create a Cloud bucket

Create a Cloud bucket in the same GCP project and in the same location/region as your BigQuery dataset. Give it any name, such as “sprinkle”.

STEP-2: Grant the Service Account Access to the Cloud Bucket

This bucket must be accessible to both BigQuery and the Sprinkle application, so grant access to the service account created for BigQuery above.

Bucket -> Add Permissions -> Add Principal (enter the name of the service account created in the BigQuery setup above) -> Add Role -> Storage Admin
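The two role grants in this guide (BigQuery Admin on the service account and Storage Admin on the bucket) can be written down as data for review or record keeping. This is a minimal sketch: the function names and the `scope` key are hypothetical, the email in the usage note is a placeholder, and the role IDs `roles/bigquery.admin` and `roles/storage.admin` are the IAM identifiers that correspond to the console role names.

```python
def iam_member(service_account_email):
    """Format a service account email as an IAM principal string."""
    return f"serviceAccount:{service_account_email}"


def required_bindings(service_account_email):
    """List the role grants this guide asks for, as plain dictionaries."""
    member = iam_member(service_account_email)
    return [
        # "scope" is a hypothetical label showing where the role is granted.
        {"scope": "project", "role": "roles/bigquery.admin", "members": [member]},
        {"scope": "bucket", "role": "roles/storage.admin", "members": [member]},
    ]
```

Calling `required_bindings("sprinkle@my-project.iam.gserviceaccount.com")` returns both grants with the principal written in the `serviceAccount:` form that IAM policies use.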

STEP-3: Configure GCP Cloud bucket connection in Sprinkle

  • Log in to the Sprinkle application

  • Navigate to Admin -> Warehouse -> New Warehouse Connection -> Add Storage

  • Select GCP

  • Provide all the mandatory details

    • Distinct Name: Name to identify this connection

    • Private Key JSON: Copy and paste the contents of the JSON key downloaded for the service account created in the BigQuery setup

    • Bucket Name: Name of the bucket created above

  • Test Connection

  • Create
