
Azure Synapse

Guide to integrating Azure Synapse with Sprinkle



This page covers the details about integrating Azure Synapse with Sprinkle.

When setting up a Synapse connection, Sprinkle additionally requires an Azure Storage container. This guide covers the role of each component and the steps to set them up.

  • Integrating Synapse: All analytical data is stored in and queried from the Synapse warehouse

  • Create Azure Storage Container: Sprinkle stores all intermediate data and report caches in this container

Step-by-step Guide for integrating Synapse

STEP-1: Allow Synapse to accept connection from Sprinkle

Allow inbound connections on the Synapse JDBC port (default is 1433) from the Sprinkle IPs (34.93.254.126, 34.93.106.136).

STEP-2: Configure Synapse Connection

  • Log into Sprinkle application

  • Navigate to Admin -> Warehouse -> New Warehouse Connection

  • Select Azure Synapse

  • Provide all the mandatory details

    • Distinct Name: Name to identify this connection

    • Host: Provide the IP address or hostname.

    • Port: Provide the port number.

    • Database: Provide the database name, if any; it must be an existing database.

    • Username

    • Password

    • Server Type: Choose between Dedicated and Serverless.

      • External Data Source: Required only for the Serverless server type

    • Advanced Settings: Can be enabled or disabled as needed. If enabled:

      • Maximum Error Count: If the load returns this many errors or more, the load fails. If it returns fewer errors, the load continues and returns an INFO message stating the number of rows that could not be loaded. Use this parameter to allow loads to continue when certain rows fail to load into the table because of formatting errors or other inconsistencies in the data.

      • Minimum Idle Connections: This property controls the minimum number of idle connections in the pool.

      • Maximum Pool Size: This property controls the maximum size that the pool is allowed to reach.

  • Test Connection

  • Create
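The Maximum Error Count behaviour described above can be sketched as follows. This is an illustrative model only, not Sprinkle's actual implementation; the function name and signature are hypothetical.

```python
# Illustrative sketch (not Sprinkle's implementation) of the
# "Maximum Error Count" semantics: a load tolerates bad rows until
# the error count reaches the configured threshold, at which point
# the whole load fails.

def load_with_error_limit(rows, parse, max_error_count):
    """Parse rows, tolerating up to max_error_count - 1 bad rows.

    Returns (loaded, error_count). Raises RuntimeError once the
    number of errors reaches max_error_count.
    """
    loaded, errors = [], 0
    for row in rows:
        try:
            loaded.append(parse(row))
        except ValueError:
            errors += 1
            if errors >= max_error_count:
                raise RuntimeError(f"load failed after {errors} errors")
    if errors:
        # Load succeeded overall; report how many rows were skipped.
        print(f"INFO: {errors} rows could not be loaded")
    return loaded, errors

# Two malformed rows out of five with a threshold of 3: the load continues.
loaded, errors = load_with_error_limit(["1", "2", "x", "4", "y"], int,
                                       max_error_count=3)
```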

Create Azure Storage Container


Sprinkle requires an Azure Storage container to store intermediate data and report caches.

Note: If the warehouse is Azure Synapse, the Data Lake Storage configured for Synapse Analytics (during creation) should be used here.

Follow the steps below to create and configure the storage container:

STEP-1: Create a Storage Account

  1. Login to the Azure Portal and select All Services.

  2. Select Storage Account and choose Add.

  3. Select the subscription in which you want to create the account.

  4. Select the desired resource group or create a new one. You can refer to the Azure documentation on resource management.

  5. Provide the storage account name. This name must be unique across all Azure storage accounts, must be between 3 and 24 characters in length, and may include only numbers and lowercase letters.

  6. Select a location for your storage account, or use the default location.

  7. Select a performance tier; Standard is recommended.

  8. Set the Account kind field to Storage V2 (general-purpose v2).

  9. Specify any other settings as required.

  10. Select Review + Create to review your storage account settings and create the account.

  11. Click Create to create the account.
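The storage account naming rule in step 5 can be expressed as a quick check. This is an illustrative sketch; the function name is hypothetical.

```python
import re

def is_valid_storage_account_name(name: str) -> bool:
    # Rule from step 5: 3-24 characters, lowercase letters
    # and numbers only (uniqueness is checked by Azure itself).
    return re.fullmatch(r"[a-z0-9]{3,24}", name) is not None

print(is_valid_storage_account_name("sprinkledata01"))  # True
print(is_valid_storage_account_name("Sprinkle-Data"))   # False: uppercase and dash
```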

STEP-2: Create Storage Container

  1. Login and go to the new storage account in the Azure portal.

  2. In the left menu for the storage account, select Containers from the Blob service section.

  3. Click on the + Container button.

  4. Provide a name for your new container. The container name must be lowercase, must start with a letter or number, and can include only letters, numbers, and the dash (-) character. Refer to the Azure documentation for more information.

  5. Set the public access level for the container, or whitelist the Sprinkle IPs shown in the Driver Setup.

  6. Select OK to create the container.

You can refer to the Azure quick start guide for storage containers.
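The container naming rule in step 4 can likewise be checked with a short sketch (illustrative only; Azure imposes further constraints, such as a 3-63 character length, that this does not cover):

```python
import re

def is_valid_container_name(name: str) -> bool:
    # Rules from step 4: lowercase only, must start with a letter
    # or number, and may contain only letters, numbers, and dashes.
    return re.fullmatch(r"[a-z0-9][a-z0-9-]*", name) is not None

print(is_valid_container_name("sprinkle-cache"))  # True
print(is_valid_container_name("-cache"))          # False: starts with a dash
```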

STEP-3: Get Access Key

  1. Navigate to your storage account in the Azure portal.

  2. Under Settings, select Access keys. Your account access keys appear, as well as the complete connection string for each key.

  3. Locate the Key value under key1, and click the Copy button to copy the account key.

STEP-4: Configure Azure Storage connection in Sprinkle

  • Log into Sprinkle application

  • Navigate to Admin -> Warehouse -> New Warehouse Connection -> Add Storage

  • Select Azure Blob Store

  • Provide all the mandatory details

    • Distinct Name: Name to identify this connection

    • Storage Account Name: Created in STEP-1

    • Container Name: Created in STEP-2

    • Access Key: Copied in STEP-3

  • Test Connection

  • Create

Get JDBC URL

In the Azure portal, go to the Synapse workspace; on the left side, under the Analytics pools tab, select SQL pools.

  • Select the SQL pool you want to integrate with Sprinkle.

  • On the left side, under the Settings tab, select Connection strings. Select JDBC and copy the JDBC URL (remove the username and password from the JDBC URL).
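Removing the credentials from the copied JDBC URL amounts to dropping the user and password properties from its semicolon-separated property list. A small illustrative sketch (the sample URL and property values are hypothetical):

```python
def strip_jdbc_credentials(jdbc_url: str) -> str:
    # Synapse/SQL Server JDBC URLs are semicolon-separated property
    # lists; drop the user and password properties before pasting
    # the URL into the connection form.
    parts = jdbc_url.split(";")
    kept = [p for p in parts
            if not p.lower().startswith(("user=", "password="))]
    return ";".join(kept)

url = ("jdbc:sqlserver://myserver.sql.azuresynapse.net:1433;"
       "database=mydb;user=me@myserver;password={your_password};"
       "encrypt=true")
print(strip_jdbc_credentials(url))
```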

If you don't have an existing Azure Synapse cluster

You can follow the steps below, or refer to the Azure documentation.

Create an Azure Synapse Analytics workspace
  1. Open the Azure portal, and at the top, search for Synapse.

  2. In the search results, under Services, select Azure Synapse Analytics.

  3. Select Create to create a workspace.

  4. On the Basics tab, give the workspace a unique name. We use myworkspace in this document.

  5. You need an Azure Data Lake Storage Gen2 account to create a workspace. The simplest choice is to create a new one. If you want to reuse an existing one, you need to perform extra configuration:

    • Option 1: Create a new Data Lake Storage Gen2 account:

      1. Under Select Data Lake Storage Gen 2 > Account Name, select Create New. Provide a globally unique name, such as contosolake.

      2. Under Select Data Lake Storage Gen 2 > File system name, select File System and name it users.

    • Option 2: See the instructions in the section Prepare an existing storage account for use with Azure Synapse Analytics below.

  6. Your Azure Synapse Analytics workspace uses this storage account as the primary storage account and the container to store workspace data. The workspace stores data in Apache Spark tables. It stores Spark application logs under a folder named /synapse/workspacename.

  7. Select Review + create > Create. Your workspace is ready in a few minutes.

Open Synapse Studio

After your Azure Synapse Analytics workspace is created, you have two ways to open Synapse Studio:

  • Open your Synapse workspace in the Azure portal. At the top of the Overview section, select Launch Synapse Studio.

  • Go to Synapse Studio (web.azuresynapse.net) and sign in to your workspace.

Prepare an existing storage account for use with Azure Synapse Analytics

  1. Open the Azure portal.

  2. Go to an existing Data Lake Storage Gen2 storage account.

  3. Select Access control (IAM).

  4. Select Add > Add role assignment to open the Add role assignment page.

  5. Assign the following role. For more information, see Assign Azure roles by using the Azure portal.

     Setting            Value
     Role               Owner and Storage Blob Data Owner
     Assign access to   USER
     Members            Your user name

  6. On the left pane, select Containers and create a container.

  7. You can give the container any name. In this document, we name the container users.

  8. Accept the default setting Public access level, and then select Create.

Configure access to the storage account from your workspace

Managed identities for your Azure Synapse Analytics workspace might already have access to the storage account. Follow these steps to make sure:

  1. Open the Azure portal and the primary storage account chosen for your workspace.

  2. Select Access control (IAM).

  3. Select Add > Add role assignment to open the Add role assignment page.

  4. Assign the following role. For more information, see Assign Azure roles by using the Azure portal.

     Setting            Value
     Role               Storage Blob Data Contributor
     Assign access to   MANAGEDIDENTITY
     Members            myworkspace

     Note: The managed identity name is also the workspace name.

  5. Select Save.

Create a serverless SQL pool while setting up the warehouse

Create a new serverless Apache Spark pool using the Azure portal
  • Azure Synapse Analytics offers various analytics engines to help you ingest, transform, model, analyze, and distribute your data.

  • An Apache Spark pool provides open-source big data computing capabilities.

  • After you create an Apache Spark pool in your Synapse workspace, data can be loaded, modelled, processed, and distributed for faster analytic insight.

Here, you'll learn how to use the Azure portal to create an Apache Spark pool in a Synapse workspace.

Prerequisites

You'll need an Azure subscription.

Sign in to the Azure portal

Navigate to the Synapse workspace

  1. Navigate to the Synapse workspace where the Apache Spark pool will be created by typing the service name (or resource name directly) into the search bar.

  2. From the list of workspaces, type the name (or part of the name) of the workspace to open. For this example, we use a workspace named contosoanalytics.

Create new Apache Spark pool

  1. In the Synapse workspace where you want to create the Apache Spark pool, select New Apache Spark pool.

  2. Enter the following details in the Basics tab:

Important: There are specific limitations on the names that Apache Spark pools can use. Names must contain letters or numbers only, must be 15 characters or fewer, must start with a letter, must not contain reserved words, and must be unique in the workspace.

  3. Select Next: additional settings and review the default settings. Don't modify any default settings.

  4. Select Next: tags. Consider using Azure tags, for example an "Owner" or "CreatedBy" tag to identify who created the resource, and an "Environment" tag to identify whether the resource is in Production, Development, etc.

  5. Select Review + create.

  6. Make sure that the details look correct based on what was previously entered, and select Create.

  7. At this point, the resource provisioning flow starts and indicates once it's complete.

  8. After provisioning completes, navigating back to the workspace shows a new entry for the newly created Apache Spark pool.

  9. At this point, no resources are running and there are no charges for Spark; you have only created metadata about the Spark instances you want to create.
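The pool-name limitations noted above can be checked with a short sketch. This is illustrative only; it does not cover the reserved-word or workspace-uniqueness checks.

```python
import re

def is_valid_spark_pool_name(name: str) -> bool:
    # Limitations from the note above: letters and numbers only,
    # 15 characters or fewer, and must start with a letter.
    return re.fullmatch(r"[A-Za-z][A-Za-z0-9]{0,14}", name) is not None

print(is_valid_spark_pool_name("contosospark1"))  # True
print(is_valid_spark_pool_name("1spark"))         # False: starts with a number
```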

External data source creation

Create a new database, which will be configured in Sprinkle:

CREATE DATABASE <DATABASE_NAME>; 
  • Use the credentials of the Data Lake Storage Gen2 account that was selected during creation of Synapse Analytics.

  • In the Azure portal home page, select the storage account, go to Security + Networking, and then Shared Access Signature.

  • Select the checkboxes under Allowed resource types and provide the start and end expiry dates.

  • Click on Generate SAS and connection string.

  • Copy the SAS token and use this in the below query:

USE <DATABASE_NAME>;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<PASSWORD>';
CREATE DATABASE SCOPED CREDENTIAL <SCOPED_NAME>
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS_TOKEN>';
CREATE EXTERNAL DATA SOURCE <DATASOURCE_NAME>
WITH
( LOCATION = 'https://<STORAGE_ACCOUNT_NAME>.blob.core.windows.net/<CONTAINER_NAME>',
CREDENTIAL = <SCOPED_NAME>);

