Azure Blob

Guide to integrating your Azure Blob Storage with Sprinkle


Datasource Concepts

Before setting up the datasource, learn about datasource concepts here.

Step-by-Step Guide

STEP-1: Configure Azure Blob Connection

To learn about Connections, refer here.

  • Log in to the Sprinkle application

  • Navigate to Datasources -> Connections Tab -> New Connection

  • Select Azure Blob

  • Provide all the mandatory details (a sketch for checking these details outside Sprinkle follows this list):

    • Name: Name to identify this connection

    • Access Key: Log in to the Azure portal and copy the key from the Storage Accounts -> Access keys view

    • Storage Account Name: Name of the storage account

    • Container Name: Name of the container created inside the storage account

  • Test Connection

  • Create
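If you want to sanity-check the storage account name, access key, and container name before entering them in Sprinkle, the short Python sketch below lists a few blobs with the official azure-storage-blob SDK. It is only an optional, illustrative check run outside Sprinkle; the placeholder values are assumptions you replace with your own.

```python
# Optional sketch (run outside Sprinkle): verify the same details the
# connection form asks for. Requires: pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient

ACCOUNT_NAME = "<storage-account-name>"   # placeholder: your storage account
ACCESS_KEY = "<access-key>"               # placeholder: Storage Accounts -> Access keys
CONTAINER_NAME = "<container-name>"       # placeholder: container inside the account

service = BlobServiceClient(
    account_url=f"https://{ACCOUNT_NAME}.blob.core.windows.net",
    credential=ACCESS_KEY,
)
container = service.get_container_client(CONTAINER_NAME)

# If the key or container name is wrong, this raises an error --
# roughly the same check Sprinkle's Test Connection performs.
for i, blob in enumerate(container.list_blobs()):
    print(blob.name)
    if i >= 4:
        break
```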

STEP-2: Configure Azure Blob datasource

To learn about datasources, refer here.

  • Navigate to Datasources -> Datasources Tab -> Add

  • Select Azure Blob

  • Provide the name -> Create

  • Connection Tab:

    • From the drop-down, select the name of the connection created in STEP-1

    • Update

STEP-3: Create Dataset

Datasets Tab: To learn about Datasets, refer here. Add a Dataset for each folder that you want to replicate, providing the following details:

  • Directory Path (Required): Provide the full path, for example: wasbs://demo-sprinkle@<storage-account>.blob.core.windows.net/testhive/datasource//34a

  • File Type: Select the file format

    • JSON

    • CSV

      • Select Delimiter - Comma, Tab, Pipe, Dash, Other Character

    • Parquet

    • ORC

  • Ingestion Mode (Required):

    • Complete: The full folder is downloaded and ingested in every ingestion job run

    • Incremental: Only the new files are ingested in each job run. Use this option if your folder is very large and you receive new files continuously

      • Remove Duplicate Rows (a sketch of this deduplication follows the list below):

        • Unique Key: Unique key from the table, used to dedupe data across multiple ingestion runs

        • Time Column Name: Used to order rows when deduplicating

      • Max Job Runtime: Maximum time, in minutes, for which data should be downloaded. The ingestion job runs for at most the specified number of minutes and then updates its checkpoint; the next run continues from that checkpoint.

  • Flatten Level (Required): Select One Level or Multi Level. With One Level, flattening is not applied to complex types; they are stored as strings. With Multi Level, complex types are flattened recursively until they become simple types (see the flattening sketch after this list).

  • Destination Schema (Required): The data warehouse schema into which the table will be ingested

  • Destination Table Name (Required): The name of the table to be created in the warehouse. If not given, Sprinkle creates it as ds_<datasourcename>_<tablename>

  • Destination Create Table Clause: Provide additional clauses for the warehouse create-table query, such as clustering, partitioning, and more, which are useful for optimizing DML statements. Learn more on how to use this field here.

  • Create
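The Remove Duplicate Rows options above describe a keep-the-latest-row-per-key rule. The sketch below is not Sprinkle's implementation, only a plain-Python illustration of that semantics using hypothetical column names (order_id as the Unique Key, updated_at as the Time Column Name).

```python
# Illustration only (not Sprinkle's code): deduplicate rows collected across
# several incremental ingestion runs, keeping the latest row per Unique Key
# as ordered by the Time Column Name.
rows = [
    {"order_id": 1, "status": "created", "updated_at": "2024-01-01T10:00:00"},
    {"order_id": 1, "status": "shipped", "updated_at": "2024-01-02T08:30:00"},
    {"order_id": 2, "status": "created", "updated_at": "2024-01-01T12:00:00"},
]

latest = {}
for row in rows:
    key = row["order_id"]            # Unique Key (hypothetical column)
    # ISO-8601 timestamps compare correctly as strings.
    if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
        latest[key] = row            # keep the most recent version of the row

deduped = list(latest.values())
print(deduped)
# order 1 keeps only its latest ("shipped") row; order 2 keeps its single row
```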
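To make the Flatten Level options concrete, here is a small, hypothetical flattening function; it is not Sprinkle's actual logic, and the underscore-joined column names it produces are an assumption purely for illustration.

```python
import json

def flatten(obj, prefix="", multi_level=True):
    """Flatten a nested dict. With multi_level=False (One Level), nested
    structures are kept whole and stored as strings; with multi_level=True
    (Multi Level), they are expanded until only simple types remain."""
    out = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            if multi_level:
                out.update(flatten(value, prefix=f"{name}_", multi_level=True))
            else:
                out[name] = json.dumps(value)  # complex type stored as a string
        else:
            out[name] = value
    return out

record = {"id": 1, "customer": {"name": "Asha", "address": {"city": "Pune"}}}
print(flatten(record, multi_level=False))  # One Level
print(flatten(record, multi_level=True))   # Multi Level
```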

STEP-4: Run and schedule Ingestion

In the Ingestion Jobs tab:

  • Trigger the job using the Run button

  • To schedule it, enable Auto-Run and change the frequency if needed

