Sprinkle Docs
Datasources Overview
Guide to integrating your data sources with Sprinkle

The basics

Your data may be spread across different systems. Sprinkle helps you bring it all together by ingesting it into your data warehouse.
A datasource helps you replicate data from a different system into your cloud data warehouse.
It's important to know the following concepts when setting up a datasource:
  • Connection: The source endpoint details. A connection can be shared by multiple datasources.
  • Datasource: A scheduled pipeline that replicates data from a source system, such as MySQL, to your data warehouse. Here you define which tables to replicate, how often, and so on.
  • Dataset: A single datasource typically has multiple datasets. Each table that you want to replicate is configured as a dataset.
Scheduling is set at the datasource level. If you want different source tables to replicate at different frequencies, group the tables into multiple datasources that all share the same connection.
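The relationship between the three concepts can be sketched as a small data model. This is only an illustration, not Sprinkle's API: the class names, fields, schedule labels, host name and the tables_by_schedule helper are all hypothetical. It shows one connection shared by two datasources so that different tables replicate at different frequencies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Connection:
    """Source endpoint details (hypothetical fields)."""
    name: str
    host: str
    port: int


@dataclass
class Dataset:
    """One source table to replicate."""
    source_table: str


@dataclass
class Datasource:
    """A scheduled pipeline: one connection, one schedule, many datasets."""
    name: str
    connection: Connection
    schedule: str  # illustrative label such as "hourly" or "daily"
    datasets: List[Dataset] = field(default_factory=list)


def tables_by_schedule(datasources: List[Datasource]) -> Dict[str, List[str]]:
    """Map each schedule label to the tables replicated at that frequency."""
    result: Dict[str, List[str]] = {}
    for ds in datasources:
        result.setdefault(ds.schedule, []).extend(d.source_table for d in ds.datasets)
    return result


# One connection, shared by two datasources with different schedules.
mysql_conn = Connection(name="orders-db", host="db.example.com", port=3306)
hourly = Datasource("orders-hourly", mysql_conn, "hourly",
                    [Dataset("orders"), Dataset("payments")])
daily = Datasource("orders-daily", mysql_conn, "daily", [Dataset("customers")])

print(tables_by_schedule([hourly, daily]))
```

Because the schedule lives on the datasource rather than on the dataset, moving a table to a different frequency means moving its dataset to another datasource, while the connection stays the same.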
You can see live stats on the number of rows and the size of the data being replicated.

Transforming Data

Sprinkle follows a modern ELT (extract, load, transform) approach: data is transformed after it arrives in your data warehouse. This decouples the transformation logic from data ingestion, so you can change the transformation logic easily and independently. You also keep both raw and derived tables in your data warehouse, which gives you a central data lake/warehouse that other tools and data science work can use.
Learn more about Transformations here.
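As a rough illustration of the ELT pattern described above, the sketch below uses Python's built-in sqlite3 module as a stand-in for a cloud data warehouse. The table names, sample rows and the build_customer_totals function are assumptions made for the example; they do not reflect how Sprinkle implements ingestion or transformations.

```python
import sqlite3

# In-memory SQLite database standing in for a cloud data warehouse.
warehouse = sqlite3.connect(":memory:")

# Extract + Load: land the source rows unchanged in a raw table.
source_rows = [(1, "alice", 120.0), (2, "bob", 80.0), (3, "alice", 40.0)]
warehouse.execute(
    "CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount REAL)"
)
warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", source_rows)


def build_customer_totals(conn: sqlite3.Connection) -> None:
    """Transform: rebuild a derived table from the raw table using SQL."""
    conn.execute("DROP TABLE IF EXISTS customer_totals")
    conn.execute(
        "CREATE TABLE customer_totals AS "
        "SELECT customer, SUM(amount) AS total "
        "FROM raw_orders GROUP BY customer"
    )


build_customer_totals(warehouse)

totals = warehouse.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 160.0), ('bob', 80.0)]
```

The raw table is left untouched, so the transformation can be rewritten and rerun at any time without ingesting the data again.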

Monitoring Ingestion