☄️ Data Imports

A guide to integrating your data sources with Sprinkle

The basics

Your data may reside in various systems. Sprinkle helps you bring all the data together by ingesting it into your data warehouse.

A Data Import is a scheduled pipeline that replicates data from a source system into your cloud data warehouse, giving you a centralized place to create and monitor your data ingestion pipelines.

When setting up a Data Import, it's crucial to understand the following concepts:

  • Connection: The endpoint details for your source system. Save a connection during the 'Establish Connection' step and reuse it to configure different Data Import pipelines.

  • Dataset: A single data source typically comprises multiple datasets. Each table you wish to replicate is configured as a dataset. During the 'Select Datasets' step, you can add datasets to your Data Import pipeline.

📢 With Data Imports, you can effortlessly define source endpoints, select the tables (datasets) for ingestion, and then run, schedule, and monitor the ingestion process.
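The exact fields vary by source type, but as a rough illustration, the sketch below shows the kind of endpoint details a saved connection for a relational-database source captures, and the tables you would then configure as datasets. The field names are hypothetical placeholders, not Sprinkle's actual configuration keys.

```python
# Hypothetical sketch only: the kind of endpoint details a saved connection
# captures for a relational-database source. Field names are illustrative
# and do not reflect Sprinkle's actual configuration schema.
connection = {
    "name": "prod-mysql",       # a saved connection can be reused across pipelines
    "host": "db.example.com",
    "port": 3306,
    "database": "sales",
    "username": "readonly_user",
    "password": "********",     # keep real credentials in a secrets store
}

# Each table you want to replicate is configured as a dataset on the pipeline.
datasets = ["orders", "customers", "payments"]
```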


🛠️ Steps to set up a Data Import Pipeline

  • Click on 'Ingest' in the left navigation menu and navigate to 'Data Imports.'

  • Click on the '+ Setup Sources' button to create a new Data Import.

  • Select the source type to begin. Sprinkle supports ingestion from 100+ sources, including databases, files, events, and applications (marketing, CRM, etc.).

The journey to set up a Data Import consists of three steps: 'Establish Connection,' 'Select Datasets,' and 'Run & Schedule'.

Progress can be tracked in the progress/status header at the top.

1️⃣ Establish Connection

In this step, you provide and test the connection endpoints for the source type selected above.

You can create a new connection or use a saved connection. Fill in the endpoint details and click 'Test Connection' to check whether a connection can be established with the endpoints provided.

  • Test Connection: Checks whether a connection can be established and displays the status: passed or failed.

  • Test & Save: Tests the connection and saves the endpoints. This is required before proceeding to the next step.

Once the connection endpoints are saved and the Test Connection status is 'passed', you can proceed to the next step.
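To build intuition for what a connection test does, here is a minimal sketch of a comparable check for a database source: it tries to reach a host and port and reports 'passed' or 'failed'. This is only an illustration of the idea; the actual test is run by Sprinkle against the endpoints you save and is not limited to plain network reachability.

```python
import socket

def test_connection(host: str, port: int, timeout: float = 5.0) -> str:
    """Illustrative stand-in for a connection test: can the source endpoint
    be reached at all? Returns 'passed' or 'failed'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "passed"
    except OSError:
        return "failed"

# Example with the hypothetical endpoints from the earlier sketch.
print(test_connection("db.example.com", 3306))
```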

2️⃣ Select Datasets

Here you can select the datasets (tables) you want to include in the ingestion. Refer to the individual data source pages in the Databases, Files, Events, and Applications (marketing, CRM, etc.) categories to learn which datasets (tables) are supported for ingestion.

Once you select at least one dataset (table) to ingest from the source, you can move to the next step, 'Run & Schedule.'
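For a database source, the candidate datasets correspond to the tables in the source schema. The sketch below uses an in-memory SQLite database purely as a stand-in source: every table is a candidate dataset, and the subset you select is what the pipeline replicates.

```python
import sqlite3

# Stand-in source database (illustrative only).
src = sqlite3.connect(":memory:")
src.executescript("""
    CREATE TABLE orders (id INTEGER, amount REAL);
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE audit_log (id INTEGER, event TEXT);
""")

# Every table in the source is a candidate dataset.
candidates = [row[0] for row in
              src.execute("SELECT name FROM sqlite_master WHERE type='table'")]

# Selecting at least one dataset is required before moving to 'Run & Schedule'.
selected = [t for t in candidates if t != "audit_log"]
assert selected, "Select at least one dataset (table) to proceed"
print(selected)  # ['orders', 'customers']
```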

3️⃣ Run & Schedule

In this final step, you can run the Data Import job and schedule runs as needed.

  • Run Now: Pushes the job to the queue for immediate execution.

  • Autorun: Enable the Autorun button to schedule the ingestion as needed. The run frequency can be selected from multiple options, ranging from real-time to monthly (see the sketch after this list).
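Sprinkle presents the run frequency as options you pick, not as expressions you type. If it helps to reason about the range in scheduler terms, the cron-style strings below are rough equivalents for a few assumed frequencies (hourly, daily, weekly, and monthly are examples, not Sprinkle's exact option list); real-time has no cron equivalent and simply means continuous ingestion.

```python
# Not Sprinkle syntax: cron-style equivalents, purely to illustrate what the
# scheduled frequencies mean. The option names are assumed examples.
frequency_examples = {
    "hourly":  "0 * * * *",   # top of every hour
    "daily":   "0 2 * * *",   # 02:00 every day
    "weekly":  "0 2 * * 1",   # 02:00 every Monday
    "monthly": "0 2 1 * *",   # 02:00 on the 1st of each month
    # "real-time" has no cron equivalent; it means continuous ingestion.
}
```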

After running the job, the job table appears below, displaying details such as tables ingested, time taken, number of records, bad records, and more.
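As a rough mental model of what to look for when monitoring runs, a single run's summary might be represented along the lines sketched below. The field names are hypothetical, not Sprinkle's actual job table schema; they simply mirror the details listed above.

```python
from datetime import timedelta

# Hypothetical shape of one job run's summary, mirroring the kind of details
# the job table surfaces. Field names are illustrative only.
job_run = {
    "status": "success",
    "tables_ingested": ["orders", "customers"],
    "time_taken": timedelta(minutes=4, seconds=12),
    "records_ingested": 128_450,
    "bad_records": 3,   # rows the run could not ingest cleanly
}

# Simple monitoring rule of thumb: flag any run with bad records for review.
if job_run["bad_records"] > 0:
    print(f"Review this run: {job_run['bad_records']} bad records")
```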
