Azure Cosmos DB

Guide to integrate your Azure Cosmos DB with Sprinkle

Datasource Concepts

Before setting up the datasource, learn about datasource concepts here.

To learn about Connection, refer here.

Log into the Sprinkle application.
Navigate to Datasources -> Connections Tab -> New Connection
Select CosmosDB
Provide all the mandatory details
- Name: Name to identify this connection
- Account Endpoint: Provide the URL in the following format:
https://xxxxxxxxxx.documents.azure.com:443/
- Master Key
Test Connection
Create
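
Before pasting the Account Endpoint into the form, it can help to check that it matches the format shown above. The following is a minimal sketch, not part of Sprinkle: the function name `is_valid_account_endpoint` is hypothetical, and the check assumes only the shape stated in this guide (https scheme, a host ending in `.documents.azure.com`, port 443, and no path beyond an optional trailing slash).

```python
from urllib.parse import urlparse

# Host suffix taken from the endpoint format shown in this guide.
ENDPOINT_SUFFIX = ".documents.azure.com"


def is_valid_account_endpoint(url: str) -> bool:
    """Return True if url looks like https://<account>.documents.azure.com:443/.

    Hypothetical helper: it checks only the shape of the URL and does not
    contact Azure or Sprinkle.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        port = parsed.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    return (
        parsed.scheme == "https"
        and host.endswith(ENDPOINT_SUFFIX)
        and len(host) > len(ENDPOINT_SUFFIX)  # account name must be present
        and port == 443
        and parsed.path in ("", "/")
    )
```

A passing check only confirms the format; use Test Connection in Sprinkle to confirm that the endpoint and Master Key actually work.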

To learn about Datasource, refer here.

Navigate to Datasources -> Datasources Tab -> Add
Select CosmosDB
Provide the name -> Create
Connection Tab:
- From the drop-down, select the name of the connection created in STEP-2
- Update

Datasets Tab: To learn about Dataset, refer here.

Add a Dataset for each collection that you want to replicate, providing the following details:

Database Id (Required)
Collection Id (Required)
Ingestion Mode: (Required)
- Complete: Ingests the full data from the source table in every ingestion job run. Choose this option if your table is small (<1 million rows) and you want to ingest it infrequently (a few times a day).
- Incremental: Ingests only the changed or inserted rows in every ingestion job run. Choose this option if your table is large and you want to ingest in real-time mode. Requires a Unique key.
  - Unique key (Required)
  To learn more about Ingestion Modes, refer here.
Automatic Schema (Required):
- Yes: Schema is automatically discovered by Sprinkle (Recommended)
- No: A Hive schema must be provided. The format for the Hive schema is: Col1 datatype, Col2 datatype, Col3 datatype. Datatypes should be warehouse specific.
Date Type: Ingestion runs from this start date or number of days. In Incremental mode, only the first run pulls from this date; further runs pull only changed or new rows.
- Start Date: Provide in the format YYYY-MM-DD
- No of days
Destination Schema (Required): Data warehouse schema into which the table will be ingested
Destination Table name (Required): The name of the table to be created in the warehouse. If not given, Sprinkle creates a name of the form ds_<datasourcename>_<tablename>
Destination Create Table Clause: Provide additional clauses to warehouse-create table queries such as clustering, partitioning, and more, useful for optimizing DML statements. Learn more on how to use this field.
Create
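
Several of the dataset fields above expect values in a fixed text format. The sketch below shows one way to produce them; all three function names are hypothetical helpers rather than part of Sprinkle, and the column names and datatypes in the usage note are placeholders, since the valid datatypes depend on your warehouse.

```python
from datetime import date


def hive_schema(columns):
    """Join (name, datatype) pairs into 'Col1 datatype, Col2 datatype' form.

    The datatypes are passed through unchanged; they must be valid for
    your warehouse.
    """
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def default_table_name(datasource_name, table_name):
    """Build a name of the form ds_<datasourcename>_<tablename>.

    This mirrors the naming pattern described in this guide; it is an
    illustration, not Sprinkle's actual implementation.
    """
    return f"ds_{datasource_name}_{table_name}"


def format_start_date(day):
    """Format a date as YYYY-MM-DD, as the Start Date field expects."""
    return day.strftime("%Y-%m-%d")
```

For example, `hive_schema([("id", "string"), ("amount", "double")])` gives `id string, amount double`, and `format_start_date(date(2024, 1, 5))` gives `2024-01-05`.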

In the Ingestion Jobs tab:
