Guide to integrate your Data Sources with Sprinkle
Your data may reside in many different systems. Sprinkle helps you bring it all together by ingesting it into your data warehouse.
A Data Import is a scheduled pipeline that replicates data from a source system into your cloud data warehouse. Data Imports let you create and monitor your data ingestion pipelines from one place.
It's important to know the following concepts when setting up a Data Import:
- Connection: The endpoint details for a source system. Save a connection during the Establish Connection step and reuse it to set up different Data Import pipelines.
- Dataset: A single data source typically has multiple datasets. Each table that you want to replicate is configured as a dataset. You can add datasets to your Data Import pipeline during the Select Datasets step.
📢 With Data Imports you can define the source endpoints, select the tables (datasets) you want to ingest, then run, schedule, and monitor ingestion.
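The relationship between these concepts can be sketched as a small data model. This is an illustrative sketch only, not Sprinkle's API: the class names, field names, and the example host are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Connection:
    """Saved source endpoint details (field names are hypothetical)."""
    name: str
    source_type: str
    host: str
    port: int


@dataclass
class DataImport:
    """A pipeline that replicates selected datasets from one connection."""
    name: str
    connection: Connection
    datasets: List[str] = field(default_factory=list)

    def add_dataset(self, table: str) -> None:
        # Each table to replicate is configured as one dataset.
        if table not in self.datasets:
            self.datasets.append(table)


# One saved connection can back several Data Import pipelines.
orders_db = Connection("orders-db", "database", "db.example.com", 5432)
sales = DataImport("sales", orders_db)
sales.add_dataset("orders")
sales.add_dataset("customers")
finance = DataImport("finance", orders_db, ["invoices"])
```

Here `sales` and `finance` are two separate pipelines that share the single saved connection `orders_db`, each with its own list of datasets.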
Data Imports: Explanation & Feature Walkthrough
Click Ingest in the left navigation menu, then browse to Data Imports.
Click the Setup Sources button to create a new Data Import. Select the source type to get started. Sprinkle supports ingestion from 100+ sources, such as databases, files, events, and applications (marketing, CRM, etc.).
Setting up a Data Import takes 3 steps: Establish Connection, Select Datasets, and Run & Schedule.
You can track your progress in the progress/status header at the top.
In the Establish Connection step, you provide and test the connection endpoints, which depend on the source type selected above.
You can create a new connection or use a saved connection. Fill in the endpoints and click Test Connection to check whether a connection can be established with the endpoints provided.
Test Connection: Tests whether a connection can be established with the provided endpoint. Clicking the button also shows the status of the connection, Failed or Passed.
Test & Save: Tests the connection and saves the endpoints provided. This is required before proceeding: you can move to the next step only when the connection endpoint is saved and the Test Connection status is Passed.
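To make the idea of a connection test concrete, here is a minimal sketch that treats a test as passed when a TCP connection to the endpoint opens. This is not how Sprinkle implements Test Connection; a real test for a database or application source would also verify credentials and permissions. The function names `check_connection` and `check_and_save` are hypothetical, and saving only when the test passes is an assumption of this sketch.

```python
import socket


def check_connection(host: str, port: int, timeout: float = 3.0) -> str:
    """Return "Passed" if a TCP connection to host:port opens, else "Failed"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "Passed"
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution errors.
        return "Failed"


def check_and_save(saved: dict, name: str, host: str, port: int) -> str:
    """Test the endpoint and store it under `name` when the test passes.

    Assumption: this sketch saves only passing endpoints.
    """
    status = check_connection(host, port)
    if status == "Passed":
        saved[name] = (host, port)
    return status
```

Calling `check_and_save(saved, "orders-db", host, port)` returns the same Failed or Passed status as the plain test, and on success leaves the endpoint in `saved` for reuse.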
In the Select Datasets step, select all the datasets (tables) that you want to include in the ingestion. To learn which datasets (tables) are supported for ingestion, refer to the individual data source pages in the following categories: databases, files, events, and applications (marketing, CRM, etc.).
Once you have selected at least one dataset (table) to ingest from the source, you can move to the next step, Run & Schedule.
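The conditions for moving between the three steps can be summarized in a short sketch. The `Step` enum and `current_step` function are hypothetical names used for illustration, and the status strings mirror the Failed and Passed labels described above.

```python
from enum import Enum


class Step(Enum):
    ESTABLISH_CONNECTION = 1
    SELECT_DATASETS = 2
    RUN_AND_SCHEDULE = 3


def current_step(connection_saved: bool, test_status: str, selected: list) -> Step:
    """Return the furthest step the setup can reach given its state."""
    # Step 1 -> 2 requires a saved connection whose test status is "Passed".
    if not (connection_saved and test_status == "Passed"):
        return Step.ESTABLISH_CONNECTION
    # Step 2 -> 3 requires at least one selected dataset (table).
    if len(selected) == 0:
        return Step.SELECT_DATASETS
    return Step.RUN_AND_SCHEDULE
```

For example, `current_step(True, "Passed", [])` stays at `Step.SELECT_DATASETS` until at least one table is selected.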
Run & Schedule is the final step, where you can run the Data Import job and schedule its runs as required.
Run Now: Immediately pushes the job to the queue to be run.
Autorun: Enable the Autorun button to schedule the ingestion as required. The run frequency can be selected from multiple options ranging from Real-time to Monthly.
Once the job has run, the Job table appears below. The table shows the details of each run, including the tables ingested, the time taken, the number of records, and the number of bad records.
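If you copy the figures from the Job table into your own notes or scripts, a summary such as the following can help spot problem runs. This is a sketch with made-up sample values: the `RunRecord` fields are hypothetical names for the columns described above, and defining the bad record rate as bad records divided by total records is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    table: str            # table ingested
    seconds_taken: float  # time taken for this table
    records: int          # number of records
    bad_records: int      # number of bad records


def summarize(runs):
    """Aggregate per-table run details into one summary dictionary."""
    total = sum(r.records for r in runs)
    bad = sum(r.bad_records for r in runs)
    return {
        "tables": [r.table for r in runs],
        "total_seconds": sum(r.seconds_taken for r in runs),
        "records": total,
        "bad_records": bad,
        # Guard against division by zero when nothing was ingested.
        "bad_record_rate": bad / total if total else 0.0,
    }


# Illustrative values only, not real Sprinkle output.
runs = [RunRecord("orders", 12.5, 1000, 5), RunRecord("customers", 3.5, 200, 1)]
summary = summarize(runs)
```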