Sprinkle Docs

Datasources Overview

A guide to integrating your data sources with Sprinkle
📢 Note: The Datasources feature on Sprinkle has been revamped as Data Imports.

The basics

Your data may live in several different systems. Sprinkle helps you bring it all together by ingesting it into your data warehouse.
A Datasource replicates data from a source system into your cloud data warehouse.
It's important to know the following concepts when setting up a Datasource:
  • Connection: The details of the source endpoint. A connection can be shared by multiple datasources.
  • Datasource: A scheduled pipeline that replicates data from a source system (for example, MySQL) to your data warehouse. Here you define which tables to replicate, how often to replicate them, and so on.
  • Dataset: Each table that you want to replicate is configured as a dataset. A single datasource typically has multiple datasets.
Scheduling is set at the datasource level. If you want different source tables to replicate at different frequencies, group the tables into separate datasources that all share the same connection.
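The relationship between these three concepts can be sketched as a small data model. This is only an illustration of the concepts described above, not Sprinkle's API or configuration format: the class names, field names, hostname, and frequency values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Connection:
    """Source endpoint details (hypothetical fields)."""
    name: str
    host: str
    port: int


@dataclass
class Dataset:
    """One source table to replicate."""
    source_table: str


@dataclass
class Datasource:
    """A scheduled pipeline: one connection, one frequency, many datasets."""
    name: str
    connection: Connection
    frequency: str  # hypothetical values such as "hourly" or "daily"
    datasets: List[Dataset] = field(default_factory=list)


# One connection shared by two datasources that run at different frequencies.
conn = Connection(name="orders_db", host="db.example.internal", port=3306)

hourly = Datasource("orders_hourly", conn, "hourly",
                    [Dataset("orders"), Dataset("payments")])
daily = Datasource("orders_daily", conn, "daily",
                   [Dataset("customers")])


def schedule_by_table(datasources: List[Datasource]) -> Dict[str, str]:
    """Map each source table to the frequency of the datasource that owns it."""
    return {ds.source_table: d.frequency
            for d in datasources
            for ds in d.datasets}


print(schedule_by_table([hourly, daily]))
```

Because the frequency belongs to the datasource rather than to the dataset, moving a table to a different schedule means moving it to a different datasource, while the connection stays the same.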
You can view live statistics on the number of rows and the volume of data being replicated.
Live monitoring of ingestion

Transforming Data

Sprinkle follows the modern ELT approach: data is transformed after it arrives in your data warehouse. This decouples the transformation logic from data ingestion, so you can change the transformation logic easily and independently. You also keep both raw and derived tables in your data warehouse, giving you a central data lake/warehouse that can be used from other tools and for data science purposes.
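As a rough illustration of the ELT pattern, the sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a data warehouse. It does not use Sprinkle itself; the table names (raw_orders, customer_revenue) and the sample rows are made up for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the data warehouse.
warehouse = sqlite3.connect(":memory:")

# Extract + Load: the source rows land in a raw table, unchanged.
warehouse.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount REAL)"
)
warehouse.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 15.5), (3, "alice", 4.5)],
)

# Transform: a derived table is built inside the warehouse with SQL.
# The raw table is left in place, so this step can be changed and re-run
# without ingesting the data again.
warehouse.execute(
    """
    CREATE TABLE customer_revenue AS
    SELECT customer, SUM(amount) AS total_amount
    FROM raw_orders
    GROUP BY customer
    """
)

rows = warehouse.execute(
    "SELECT customer, total_amount FROM customer_revenue ORDER BY customer"
).fetchall()
print(rows)
```

After the transform runs, both raw_orders and customer_revenue exist side by side, which mirrors the raw and derived tables described above.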
Learn more about Transformations here.