
Databricks

Guide to integrate Databricks with Sprinkle



This page covers the details of integrating Databricks with Sprinkle.

When setting up a Databricks connection, Sprinkle additionally requires a Cloud bucket. This guide covers the role of each component and the steps to set them up.

  • Databricks Warehouse: All analytical data is stored in and queried from the Databricks warehouse.

  • Cloud Bucket: Sprinkle stores all intermediate data and report caches in this bucket.

Step by Step Guide

Integrating Databricks

STEP-1: Allow Databricks to accept connection from Sprinkle

Allow inbound connections on the Databricks JDBC port (443 by default) from the Sprinkle IPs (34.93.254.126, 34.93.106.136). If your workspace restricts access with IP access lists, the sketch below shows one way to add these IPs.
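The following is a minimal sketch using the Databricks IP access list REST API, not a Sprinkle-documented procedure. It assumes the IP access lists feature is enabled on your workspace; <workspace-host> and the admin token are placeholders, and your network setup (firewalls, security groups) may call for a different approach.

# Sketch: allow the Sprinkle IPs via the Databricks IP access list API.
# Assumes the IP access lists feature is enabled on the workspace.
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.0/ip-access-lists",
    headers={"Authorization": "Bearer <admin-personal-access-token>"},
    json={
        "label": "sprinkle",
        "list_type": "ALLOW",
        "ip_addresses": ["34.93.254.126/32", "34.93.106.136/32"],
    },
)
resp.raise_for_status()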

STEP-2: Configure Databricks Connection on Sprinkle

To get the connection details for a Databricks cluster, do the following:

  1. Log in to your Databricks workspace.

  2. In the sidebar, click Compute.

  3. In the list of available clusters, click the target cluster’s name.

  4. On the Configuration tab, expand Advanced options.

  5. Click the JDBC/ODBC tab.

  6. Copy the connection details that you need, such as Server Hostname, Port, and HTTP Path.

To get the connection details for a Databricks SQL warehouse, do the following:

  1. Log in to your Databricks workspace.

  2. In the sidebar, click SQL > SQL Warehouses.

  3. In the list of available warehouses, click the target warehouse’s name.

  4. On the Connection Details tab, copy the connection details that you need, such as Server hostname, Port, and HTTP path (illustrative values are shown below).
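For illustration only, the copied values typically have the following shapes; all IDs below are hypothetical placeholders:

# Illustrative connection values (hypothetical IDs).
server_hostname = "dbc-a1b2c3d4-e5f6.cloud.databricks.com"
port = 443
# HTTP path for a cluster:
http_path = "sql/protocolv1/o/1234567890123456/0123-456789-abcde123"
# HTTP path for a SQL warehouse:
# http_path = "/sql/1.0/warehouses/abcdef0123456789"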

  • Log in to the Sprinkle application

  • Navigate to Admin -> Warehouse -> New Warehouse Connection

  • Select Databricks

  • In the Connect Warehouse form in Sprinkle, provide all the mandatory details:

    • Distinct Name: Name to identify this connection

    • Host: Provide the IP address or hostname of your Databricks instance.

    • Port: Provide the port number for your Databricks instance.

    • Database: Provide the name of the specific database you want to connect to within Databricks, if applicable. This should be an existing database.

    • HTTP Path: Provide the HTTP path component of your Databricks cluster connection URL. This path identifies the specific Databricks instance you're trying to access.

    • Username: The username (ID) you use to log in to Databricks.

    • Password: Personal access token. To generate one, see the Databricks documentation on personal access tokens.

    • Storage Mount Name: The storage mount that Databricks will use for Sprinkle's data. See the Creating Storage Mount section below for details.

  • Click Test Connection (you can also verify connectivity yourself with the sketch below)

  • Click Create
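Optionally, you can sanity-check the same credentials outside Sprinkle. This is a sketch using the databricks-sql-connector Python package (pip install databricks-sql-connector), which is not part of Sprinkle; the hostname, HTTP path, and token are placeholders:

# Sketch: verify Databricks connectivity with databricks-sql-connector.
from databricks import sql

with sql.connect(
    server_hostname="<server-hostname>",
    http_path="<http-path>",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchall())  # A returned row confirms the connection works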

Creating Storage Mount

Go to the Databricks home page, click the Create button on the right side, and select Notebook. Attach the notebook to the cluster you want to configure with Sprinkle and select Python as the default language.

Run the Python code below that matches your cloud; Sprinkle currently supports Databricks on the Azure and AWS clouds. For background on mounting storage, see the Databricks documentation for Azure Storage (https://docs.databricks.com/data/data-sources/azure/azure-storage.html) and Amazon S3 (https://docs.databricks.com/data/data-sources/aws/amazon-s3.html).

Azure Blob

# Mount an Azure Blob Storage container into DBFS using the storage account key.
# Replace <container-name>, <storage-account-name>, <mount-name>, and <storage key>.
dbutils.fs.mount(
  source = "wasbs://<container-name>@<storage-account-name>.blob.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {"fs.azure.account.key.<storage-account-name>.blob.core.windows.net": "<storage key>"})
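To confirm the mount worked, list its contents with the same dbutils call the S3 snippet below uses:

display(dbutils.fs.ls("/mnt/<mount-name>"))  # Should list the container's contents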

S3

# Mount an S3 bucket into DBFS using AWS access keys.
# Replace <Access_Key>, <Secret_Key>, <Bucket_Name>, and <mount_name>.
AccessKey = "<Access_Key>"
SecretKey = "<Secret_Key>"
# URL-encode any "/" in the secret key so the s3a URI stays valid.
SecretKey = SecretKey.replace("/", "%2F")
aws_bucket_name = "<Bucket_Name>"
mount_name = "<mount_name>"
dbutils.fs.mount("s3a://%s:%s@%s" % (AccessKey, SecretKey, aws_bucket_name), "/mnt/%s" % mount_name)
# Verify the mount by listing its contents.
display(dbutils.fs.ls("/mnt/%s" % mount_name))
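If a mount with the same name already exists (see Note 2 below), unmount it first and then re-run the mount code; dbutils.fs.unmount is the standard call for this:

dbutils.fs.unmount("/mnt/<mount_name>")  # Then re-run the mount code above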

Note:

  1. The Storage configured in Sprinkle and the Storage mount on Databricks should point to the same bucket.

  2. Give the Storage Mount a unique name that does not collide with existing mounts. (If the mount path is /mnt/sprinkle, just enter sprinkle.)

  3. Set the property "spark.databricks.delta.alterTable.rename.enabledOnAWS" to true in Databricks (see the sketch below).
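One way to set that property is from a notebook cell, as sketched below; it can also be set in the cluster's Spark config. This is one option, not the only supported method:

# Enable Delta table renames on AWS for the current Spark session.
spark.conf.set("spark.databricks.delta.alterTable.rename.enabledOnAWS", "true")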

Create a Cloud Bucket

The Cloud bucket should be created in the same cloud as your Databricks deployment. Sprinkle supports buckets in AWS or Azure; refer to the respective documents, Create Azure Storage Container and Create S3 Bucket, for creating and configuring the Cloud Bucket.
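For example, on AWS an S3 bucket can be created with boto3. The bucket name and region below are hypothetical, and your organization may require additional settings such as encryption or access policies:

# Sketch: create an S3 bucket for Sprinkle's intermediate data.
import boto3

s3 = boto3.client("s3", region_name="ap-south-1")  # hypothetical region
s3.create_bucket(
    Bucket="my-sprinkle-bucket",  # hypothetical name; bucket names are globally unique
    CreateBucketConfiguration={"LocationConstraint": "ap-south-1"},
)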
