Google GKE
Guide to creating a GKE cluster and connecting it to Sprinkle
Sprinkle requires a GKE cluster for data ingestion and processing, so that all your data is processed locally within your private Google Cloud network.
Follow the steps below to create and configure the GKE cluster:
STEP-1: Create GKE Cluster
Create a GKE Standard cluster, keeping the default settings except for the following configuration:
Networking
Select Public cluster
Enable control plane authorized networks: Provide any name, such as Sprinkle network, and enter the CIDR range 34.93.254.126/32 to whitelist the Sprinkle IP
Default Node Pool: 2 nodes
Node Type: n1-standard-1
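The same configuration can also be expressed with the gcloud CLI instead of the console. The following is a minimal sketch, not Sprinkle's documented procedure: the cluster name sprinkle-gke and the zone asia-south1-a are hypothetical placeholders, and the function only wraps a standard gcloud container clusters create call with the node count, node type, and CIDR range listed above.

```shell
# Hypothetical placeholders (not from the Sprinkle docs); replace with your own.
CLUSTER_NAME="sprinkle-gke"
ZONE="asia-south1-a"

# CIDR range that whitelists the Sprinkle IP, as given in the step above.
SPRINKLE_CIDR="34.93.254.126/32"

# Creates a Standard cluster with 2 n1-standard-1 nodes and restricts
# control plane access to the authorized CIDR range.
create_sprinkle_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --zone "$ZONE" \
    --num-nodes 2 \
    --machine-type n1-standard-1 \
    --enable-master-authorized-networks \
    --master-authorized-networks "$SPRINKLE_CIDR"
}
```

Calling create_sprinkle_cluster after gcloud auth login would create the cluster. The --master-authorized-networks flag accepts a comma-separated list, so the CIDR of the workstation that runs kubectl in STEP-2 can be appended if needed.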
STEP-2: Generate user token
Sprinkle authenticates to the GKE cluster using a Kubernetes user token. Follow the steps below to generate the user token:
Install kubectl and the gcloud CLI
Generate ~/.kube/config file:
To verify the setup, run the kubectl command that lists the running nodes:
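The two steps above, generating the kubeconfig entry and verifying it, can be sketched as follows. This is an illustration under assumptions: the project ID, cluster name, and zone are hypothetical placeholders, and the functions wrap the standard gcloud container clusters get-credentials and kubectl get nodes commands.

```shell
# Hypothetical placeholders; replace with your own project, cluster, and zone.
PROJECT_ID="my-gcp-project"
CLUSTER_NAME="sprinkle-gke"
ZONE="asia-south1-a"

# Writes (or updates) the entry for this cluster in ~/.kube/config.
generate_kubeconfig() {
  gcloud container clusters get-credentials "$CLUSTER_NAME" \
    --zone "$ZONE" \
    --project "$PROJECT_ID"
}

# Lists the cluster nodes to confirm that kubectl can reach the cluster.
verify_cluster_access() {
  kubectl get nodes
}
```

Run generate_kubeconfig first and verify_cluster_access afterwards. With the default node pool from STEP-1, the second command should list two nodes.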
Create namespace
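A minimal sketch of this step, assuming the namespace is named sprinkle to match the Deploy namespace value entered in STEP-3. The existence check is an addition for convenience so that the step can be repeated safely; it is not part of the original instructions.

```shell
# Namespace name, assumed to match "Deploy namespace" in STEP-3.
NAMESPACE="sprinkle"

# Creates the namespace only if it does not exist yet.
ensure_namespace() {
  if kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
    echo "namespace $NAMESPACE already exists"
  else
    kubectl create namespace "$NAMESPACE"
  fi
}
```

Calling ensure_namespace once the kubeconfig is in place creates the namespace, or reports that it is already there.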
Create an admin user in Kubernetes: create a file service-account-create.yml:
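The original file contents are not reproduced here, so the following is only a plausible sketch of a ServiceAccount manifest. The account name sprinkle-admin is an assumption (chosen to line up with the sprinkle-admin-secret.yml file name used later), and the namespace is assumed to be sprinkle.

```shell
# Writes a ServiceAccount manifest.
# Assumed names: ServiceAccount "sprinkle-admin" in namespace "sprinkle".
cat > service-account-create.yml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sprinkle-admin
  namespace: sprinkle
EOF
```

The manifest can then be applied with kubectl apply -f service-account-create.yml.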
Create a ClusterRoleBinding: create a file role-binding.yml:
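As a hedged illustration, a ClusterRoleBinding that grants an admin ServiceAccount the built-in cluster-admin ClusterRole could look like the sketch below. The binding name, the ServiceAccount name sprinkle-admin, and the choice of cluster-admin are assumptions; the subject must match whatever is defined in service-account-create.yml.

```shell
# Writes a ClusterRoleBinding manifest.
# Assumptions: the ServiceAccount is "sprinkle-admin" in namespace "sprinkle",
# and it is bound to the built-in "cluster-admin" ClusterRole.
cat > role-binding.yml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sprinkle-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: sprinkle-admin
    namespace: sprinkle
EOF
```

The manifest can then be applied with kubectl apply -f role-binding.yml.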
To create a long-lived API token for the ServiceAccount, create a new Secret file sprinkle-admin-secret.yml with the special annotation kubernetes.io/service-account.name:
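A sketch of such a Secret is shown below. Kubernetes populates the token for a Secret of type kubernetes.io/service-account-token whose kubernetes.io/service-account.name annotation names an existing ServiceAccount. The Secret name sprinkle-admin-secret, the ServiceAccount name sprinkle-admin, and the namespace sprinkle are assumptions.

```shell
# Writes a service account token Secret manifest.
# Assumed names: Secret "sprinkle-admin-secret", ServiceAccount "sprinkle-admin",
# namespace "sprinkle". The annotation value must name an existing ServiceAccount.
cat > sprinkle-admin-secret.yml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: sprinkle-admin-secret
  namespace: sprinkle
  annotations:
    kubernetes.io/service-account.name: sprinkle-admin
type: kubernetes.io/service-account-token
EOF
```

The manifest can then be applied with kubectl apply -f sprinkle-admin-secret.yml.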
The following command prints the user token. Note down the generated token:
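One way to print the token, sketched under assumptions: the Secret is assumed to be named sprinkle-admin-secret and to live in the sprinkle namespace (use the metadata.name and namespace from your sprinkle-admin-secret.yml). The token is stored base64-encoded in the Secret's data.token field, so it is decoded before printing.

```shell
# Assumed names; adjust to the values in sprinkle-admin-secret.yml.
NAMESPACE="sprinkle"
SECRET_NAME="sprinkle-admin-secret"

# Reads the base64-encoded token from the Secret and decodes it.
print_user_token() {
  kubectl -n "$NAMESPACE" get secret "$SECRET_NAME" \
    -o jsonpath='{.data.token}' | base64 --decode
}
```

Calling print_user_token writes the decoded token to standard output. This is the value to paste into the User Token field in STEP-3.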
STEP-3: Configure GKE connection
Log in to the Sprinkle application
Navigate to Admin -> Drivers -> Create Compute
Select GKE
Provide all the mandatory details
Distinct Name: Any name to identify the connection
Cluster Url: Provide the URL of the GKE cluster in the format https://<ENDPOINT>
Cluster CA Certificate: Provide the cluster CA certificate of the GKE cluster
Is Certificate Encoded: No
User Token: Paste the User Token generated in STEP-2 above
Deploy namespace: sprinkle
Supported Kernels: Enter the names of the supported kernels. For more than one kernel, separate them with commas.
CPU and VM size for Notebook: Number of CPUs and size of each VM. For more than one entry, separate them with commas.
CPU and VM size for Ingestion: Number of CPUs and size of each VM.
Node group labels: Key-value pairs separated by commas, e.g. key1: val1, key2: val2, key3: val3. If a label is configured, all pods are launched only on the node group that has the same label.
Advance Settings: Yes
Notebook Idle Timeout: Time in minutes
Notebook Docker Container url: URL of the Jupyter Notebook Docker container, available on Docker Hub. See available images here
Notebook: Yes or No. Select Yes to enable Notebook if you want to use a notebook for transforms.
Test Connection
Create
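The Cluster Url and Cluster CA Certificate values requested above can be looked up with the gcloud CLI. The sketch below is an illustration, not Sprinkle's documented method: the cluster name and zone are hypothetical placeholders, and decoding the certificate assumes that Is Certificate Encoded: No means Sprinkle expects the decoded PEM text (gcloud returns the certificate base64-encoded).

```shell
# Hypothetical placeholders; replace with your own cluster name and zone.
CLUSTER_NAME="sprinkle-gke"
ZONE="asia-south1-a"

# Prints the value for "Cluster Url" in the https://<ENDPOINT> format.
cluster_url() {
  endpoint="$(gcloud container clusters describe "$CLUSTER_NAME" \
    --zone "$ZONE" --format='value(endpoint)')"
  printf 'https://%s\n' "$endpoint"
}

# Prints the cluster CA certificate, decoded from the base64 form that
# gcloud returns (assumption: Sprinkle expects the decoded certificate).
cluster_ca_certificate() {
  gcloud container clusters describe "$CLUSTER_NAME" \
    --zone "$ZONE" --format='value(masterAuth.clusterCaCertificate)' \
    | base64 --decode
}
```

Calling cluster_url and cluster_ca_certificate prints the two values to copy into the form.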
Checking Sprinkle Job Logs
Open the Google Cloud console: https://console.cloud.google.com/
Search for "Logging".
In the left panel, click Logs Explorer.
In the Query panel:
Click Resource -> select Kubernetes Container -> choose the cluster name -> choose the namespace.
In the Search container name box, enter the Sprinkle job ID. (To find the job ID, click the "Show details" link for that job in the Sprinkle UI.)
Run the query.
Check the logs.
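The same lookup can be sketched as a Logging query run from the command line. This is an assumption-laden example: the cluster name is a placeholder, the namespace is assumed to be sprinkle, and the container name is matched with the substring operator (:) on the assumption that the job ID appears inside the container name, as the search box in the console suggests.

```shell
# Hypothetical placeholder cluster name; namespace assumed to be "sprinkle".
CLUSTER_NAME="sprinkle-gke"
NAMESPACE="sprinkle"

# Builds a Logging filter for containers whose name contains the job ID.
build_job_log_filter() {
  job_id="$1"
  printf 'resource.type="k8s_container" AND resource.labels.cluster_name="%s" AND resource.labels.namespace_name="%s" AND resource.labels.container_name:"%s"' \
    "$CLUSTER_NAME" "$NAMESPACE" "$job_id"
}

# Reads the most recent matching log entries from the last day.
read_job_logs() {
  gcloud logging read "$(build_job_log_filter "$1")" --limit 50 --freshness 1d
}
```

For example, read_job_logs followed by the job ID from the "Show details" link would print up to 50 recent entries. The string produced by build_job_log_filter can also be pasted directly into the Logs Explorer query box.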