Google BigQuery

Guide to integrate your BigQuery with Sprinkle

This page covers the details about integrating BigQuery with Sprinkle.

When setting up a BigQuery connection, Sprinkle additionally requires a Cloud bucket. This guide covers the role of each component and the steps to set them up.

Integrating BigQuery: All analytical data is stored in and queried from the BigQuery warehouse.
Create Cloud Bucket: Sprinkle stores all intermediate data and report caches in this bucket.

Step by Step Guide

Integrating BigQuery

STEP-1: Create a Service Account

Create a service account that Sprinkle will use to connect to BigQuery.

Create a service account with any name, such as “sprinkle”.
Grant the service account the BigQuery Admin role.
Create a JSON key for this service account and download it.
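
Before pasting the downloaded key into Sprinkle, it can help to confirm that the file is intact. The following is a minimal sketch, not part of Sprinkle: the function name check_service_account_key and the sample values are hypothetical, and the field list is an assumption based on the fields a Google Cloud service account JSON key normally contains.

```python
import json

# Assumption: fields a Google Cloud service account JSON key normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text):
    """Return a list of problems found in the key text (empty if none)."""
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A user-account or other credential file would carry a different type.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical, truncated key used only to demonstrate the check.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-gcp-project",
        "private_key": "-----BEGIN PRIVATE KEY-----\n...",
        "client_email": "sprinkle@my-gcp-project.iam.gserviceaccount.com",
    })
    print(check_service_account_key(sample))
```

An empty list means the key has the expected shape. It does not prove that the key is still active or that the BigQuery Admin role has been granted.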

STEP-2: Create a BigQuery dataset

Create a BigQuery dataset with any name, such as “sprinkle_dataset”. Dataset names may contain only letters, numbers, and underscores, so hyphens are not allowed. Sprinkle will create all tables within this dataset.
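
BigQuery dataset names accept letters, digits, and underscores, so a hyphenated name would be rejected when the dataset is created. A small sketch of a pre-check follows. The helper name is_valid_dataset_name is hypothetical, and the pattern is a conservative reading of the BigQuery naming rule (ASCII letters, digits, and underscores, up to 1,024 characters).

```python
import re

# Conservative assumption: ASCII letters, digits, and underscores, 1 to 1024 characters.
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_dataset_name(name):
    """Return True if the name fits the conservative dataset naming pattern."""
    return _DATASET_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("sprinkle_dataset", "sprinkle-dataset"):
        print(candidate, is_valid_dataset_name(candidate))
```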

STEP-3: Configure BigQuery Connection

Log in to the Sprinkle application
Navigate to Admin -> Warehouse -> New Warehouse Connection
Select BigQuery
Provide all the mandatory details
- Distinct Name: Name to identify this connection
- Project Id: Enter the ID of the GCP project that contains your BigQuery dataset.
- Private JSON key: Copy and paste the contents of the JSON key file downloaded during service account creation. (STEP-1)
- Dataset: Specify the name of the BigQuery dataset you want to use (created in STEP-2 above). Datasets are top-level containers that organize and control access to your tables within BigQuery.
- Advanced Settings (Optional):
  - Maximum Error Count: This optional setting defines a threshold for errors encountered during data load operations. If the number of errors returned by the load process exceeds the specified Maximum Error Count, the load fails. If the error count does not exceed the threshold, the load continues and returns an informational message detailing the number of rows that failed to load due to formatting errors or other data inconsistencies.
Test Connection
Create
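
The Maximum Error Count behaviour described above can be illustrated with a short sketch. This is not Sprinkle's or BigQuery's code: the function name load_outcome is hypothetical, and the default of 0 (no bad rows tolerated) is an assumption.

```python
def load_outcome(bad_rows, max_error_count=0):
    """Describe how a load ends for a given number of rejected rows."""
    # The load fails only when the error count exceeds the threshold.
    if bad_rows > max_error_count:
        return f"failed: {bad_rows} bad rows exceed the limit of {max_error_count}"
    # At or below the threshold, the load continues and reports skipped rows.
    if bad_rows:
        return f"succeeded: {bad_rows} rows skipped"
    return "succeeded"


if __name__ == "__main__":
    for bad in (0, 5, 6):
        print(bad, "->", load_outcome(bad, max_error_count=5))
```

With a threshold of 5, a load with 5 bad rows still completes, while a load with 6 bad rows fails.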

Create Cloud Bucket

Sprinkle requires a Cloud Bucket to store intermediate data and report caches. Follow the steps below to create and configure the cloud bucket:

STEP-1: Create a Cloud bucket

Create a Cloud bucket in the same GCP project and in the same location/region as your BigQuery dataset. Provide any name, such as “sprinkle”.

STEP-2: Provide Cloud Bucket access to the Service Account

This bucket should be accessible by BigQuery as well as the Sprinkle application, so grant access to the service account created in the BigQuery setup above:

Bucket -> Add Permissions -> Add Principal (provide the email address of the service account created in the BigQuery setup above) -> Add Role: Storage Admin
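
For readers who manage permissions as configuration rather than through the console, the grant above corresponds to a binding in the bucket's IAM policy. The sketch below only shows the shape of that binding and calls no Google Cloud API. The function name grant_storage_admin and the sample email are hypothetical. The role ID roles/storage.admin and the serviceAccount: member prefix follow the standard IAM policy format.

```python
import copy

STORAGE_ADMIN = "roles/storage.admin"


def grant_storage_admin(policy, service_account_email):
    """Return a copy of the policy with the service account added as Storage Admin."""
    member = f"serviceAccount:{service_account_email}"
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == STORAGE_ADMIN:
            # Reuse the existing binding and avoid duplicate members.
            if member not in binding.setdefault("members", []):
                binding["members"].append(member)
            return updated
    bindings.append({"role": STORAGE_ADMIN, "members": [member]})
    return updated


if __name__ == "__main__":
    # Hypothetical service account email used only for illustration.
    print(grant_storage_admin({}, "sprinkle@my-gcp-project.iam.gserviceaccount.com"))
```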

STEP-3: Configure GCP Cloud bucket connection in Sprinkle

Log in to the Sprinkle application
Navigate to Admin -> Warehouse -> New Warehouse Connection -> Add Storage
Select GCP
Provide all the mandatory details
- Distinct Name: Name to identify this connection
- Private Key JSON: Copy and paste the contents of the JSON key downloaded for the service account created in the BigQuery setup
- Bucket Name: Name of the bucket created above
Test Connection
Create


Last updated 1 year ago