Apache Kafka
Guide to integrating Kafka with Sprinkle
Datasource Concepts
Before setting up the datasource, learn about datasource concepts here
Step by Step Guide
STEP-1: Configure Kafka Connection
To learn about Connection, refer here
Log in to the Sprinkle application
Navigate to Datasources -> Connections Tab -> New Connection ->
Select Kafka
Provide all the mandatory details
Name: Name to identify this connection
Connection Type: Select how to connect, either via Zookeeper or via bootstrap servers
Zookeeper Connection: Provide in the format zk1:2181,zk2:2181,zk3:2181
Bootstrap Server: Provide in the format host:9092
Test Connection
Create
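Before entering the Zookeeper Connection or Bootstrap Server value, it can help to check that the string follows the comma-separated host:port format described above. The following is a minimal sketch in Python; the function name parse_host_ports is hypothetical and is not part of Sprinkle or Kafka. It only checks the format and does not contact any server.

```python
from typing import List, Tuple


def parse_host_ports(value: str) -> List[Tuple[str, int]]:
    """Split a comma-separated 'host:port' string into (host, port) pairs.

    Hypothetical helper for illustration; it validates format only.
    """
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        # rpartition splits on the last ':' so the port is always the tail.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        pairs.append((host, port_number))
    return pairs


if __name__ == "__main__":
    # The two example formats from the connection form.
    print(parse_host_ports("zk1:2181,zk2:2181,zk3:2181"))
    print(parse_host_ports("host:9092"))
```

A value that passes this check can still fail Test Connection, for example when the hosts are not reachable from Sprinkle.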
STEP-2: Configure Kafka datasource
To learn about datasource, refer here
Navigate to Datasources -> Datasources Tab -> Add ->
Select Kafka
Provide the name -> Create
Connection Tab:
From the drop-down, select the name of the connection created in STEP-1
Update
STEP-3: Create Dataset
Datasets Tab: To learn about Dataset, refer here. Add a dataset for each topic that you want to replicate, providing the following details
Topic Name (Required)
Automatic Schema (Required):
Yes: Schema is automatically discovered by Sprinkle (Recommended)
Flatten Level (Required): Select One Level or Multi Level. With One Level, complex types are not flattened and are stored as strings. With Multi Level, complex types are flattened repeatedly until only simple types remain.
No: Provide the warehouse schema manually, in the format Col1 datatype, Col2 datatype, Col3 datatype. Datatypes must be valid for your warehouse.
Destination Schema (Required): Data warehouse schema into which the table will be ingested
Destination Table name (Required): Name of the table to be created in the warehouse. If not given, Sprinkle creates a name of the form ds_<datasourcename>_<tablename>
Destination Create Table Clause: Provide additional clauses to warehouse-create table queries such as clustering, partitioning, and more, useful for optimizing DML statements. Learn more on how to use this field.
Create
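The difference between the two Flatten Level options can be illustrated with a small Python sketch. This is not Sprinkle's implementation: the function flatten, the underscore-joined column names, the JSON string encoding, and the choice to keep lists as strings are all assumptions made for illustration.

```python
import json
from typing import Any, Dict


def flatten(record: Dict[str, Any], multi_level: bool, prefix: str = "") -> Dict[str, Any]:
    """Turn a nested record into a flat column -> value mapping.

    Illustrative only. Column naming (parent_child) and the handling of
    lists are assumptions, not documented Sprinkle behaviour.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict) and multi_level:
            # Multi Level: keep descending until only simple types remain.
            flat.update(flatten(value, True, prefix=f"{name}_"))
        elif isinstance(value, (dict, list)):
            # One Level (and lists in this sketch): store complex values as strings.
            flat[name] = json.dumps(value, sort_keys=True)
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    message = {"id": 1, "user": {"name": "a", "address": {"city": "x"}}}
    print(flatten(message, multi_level=False))  # "user" stays a single string column
    print(flatten(message, multi_level=True))   # id, user_name, user_address_city
```

With One Level the warehouse table stays narrow and nested fields have to be parsed at query time; with Multi Level each nested field becomes its own column.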
STEP-4: Run and schedule Ingestion
In the Ingestion Jobs tab:
Trigger the job using the Run button
To schedule, enable Auto-Run. Change the frequency if needed