SFTP
Guide to integrating your files using SFTP
Before setting up the datasource, learn about datasource concepts
To learn about Connection, refer
Log into Sprinkle application
Navigate to Datasources -> Connections Tab -> New Connection ->
Select SFTP
Provide all the mandatory details
Name: Name to identify this connection
SSH Host: IP address or hostname of the SSH server.
SSH Port: Port of the SSH server. Default is 22.
SSH Login Username: SFTP username
Choose an Authentication Mode (one of the following) and enter the required information:
SSH Public Key
Password
Test Connection
Create
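Before filling in the connection form, it can help to confirm that the SSH Host and SSH Port are reachable from your network. The following is a minimal sketch using only the Python standard library; the function name `read_ssh_banner` is hypothetical, and this is not how Sprinkle's Test Connection works internally. It only checks that something answering like an SSH server is listening, and does not verify the username, password, or key.

```python
import socket


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Connect to host:port and return the first line the server sends.

    An SSH server identifies itself with a line starting with "SSH-"
    (RFC 4253, section 4.2). Simplification: servers may send other
    lines before that one; this sketch only inspects the first line.
    """
    data = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Read until the end of the first line (the banner is at most 255 bytes).
        while b"\n" not in data and len(data) < 255:
            chunk = sock.recv(255 - len(data))
            if not chunk:
                break  # server closed the connection early
            data += chunk
    banner = data.split(b"\n", 1)[0].decode("ascii", errors="replace").strip()
    if not banner.startswith("SSH-"):
        raise ValueError(f"{host}:{port} did not answer with an SSH banner: {banner!r}")
    return banner
```

For example, `read_ssh_banner("sftp.example.com", 22)` (a placeholder hostname) returns the server's identification line if the port is open, and raises an exception if the host is unreachable or the service is not SSH.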
Navigate to Datasources -> Datasources Tab -> Add ->
Select SFTP
Provide the name -> Create
Connection Tab:
From the drop-down, select the name of the connection created in STEP-2
Update
File Type: Select the File Format
JSON
CSV
Select Delimiter - Comma, Tab, Pipe, Dash, Other Character
Directory Path (Required): Provide the full path of the directory on the SFTP server
Ingestion Mode (Required):
Complete: The full folder is downloaded and ingested on every ingestion job run
Incremental: Only the new files are ingested on every ingestion job run. Use this option if your folder is very large and receives new files continuously
Remove Duplicate Rows:
Unique Key: Unique key of the table, used to deduplicate data across multiple ingestions
Time Column Name: Column used to order data for deduplication
Max Job Runtime: Maximum time, in minutes, for which data should be downloaded. The ingestion job runs for at most the specified number of minutes and then updates the checkpoint. The next run continues from that checkpoint.
Destination Schema (Required): Data warehouse schema into which the table will be ingested
Destination Table Name (Required): Table name suffix used to create the table in the warehouse
Create
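The difference between the two ingestion modes, and the way Max Job Runtime interacts with the checkpoint, can be pictured with a small sketch. Everything here is an assumption made for illustration: the names `RemoteFile`, `plan_ingestion`, and `run_job` are hypothetical, and "new files" is modelled as files whose (modification time, path) sorts after the last checkpoint. Sprinkle's actual checkpoint format is not described in this guide.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

Checkpoint = Tuple[float, str]  # (modification time, path) of the last ingested file


@dataclass(frozen=True)
class RemoteFile:
    path: str
    modified_at: float  # modification time, seconds since the epoch


def plan_ingestion(files: Iterable[RemoteFile], mode: str,
                   checkpoint: Optional[Checkpoint]) -> List[RemoteFile]:
    """Return the files one run would ingest, oldest first."""
    ordered = sorted(files, key=lambda f: (f.modified_at, f.path))
    if mode == "complete":
        return ordered  # the whole folder, on every run
    if mode == "incremental":
        if checkpoint is None:
            return ordered  # first run: nothing ingested yet
        return [f for f in ordered if (f.modified_at, f.path) > checkpoint]
    raise ValueError(f"unknown ingestion mode: {mode!r}")


def run_job(files: Iterable[RemoteFile], mode: str,
            checkpoint: Optional[Checkpoint], max_runtime_minutes: float,
            ingest: Callable[[RemoteFile], None],
            clock: Callable[[], float] = time.monotonic,
            ) -> Tuple[List[str], Optional[Checkpoint]]:
    """Ingest files until the time budget is used up, then return the checkpoint."""
    deadline = clock() + max_runtime_minutes * 60
    done: List[str] = []
    for f in plan_ingestion(files, mode, checkpoint):
        if clock() >= deadline:
            break  # budget exhausted; the next run resumes from the checkpoint
        ingest(f)
        done.append(f.path)
        checkpoint = (f.modified_at, f.path)
    return done, checkpoint
```

Passing the checkpoint returned by one `run_job` call into the next call is what lets an incremental run pick up where the previous one stopped, instead of downloading the folder again.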
In the Ingestion Jobs tab:
Trigger the job using the Run button
To schedule the job, enable Auto-Run. Change the frequency if needed
To learn about datasource, refer
Datasets Tab: To learn about Dataset, refer . Add a Dataset for each directory that you want to replicate, providing the following details:
Destination Create Table Clause: Provide additional clauses for the warehouse CREATE TABLE query, such as clustering or partitioning. These are useful for optimizing DML statements.
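The Remove Duplicate Rows setting among the dataset details combines a Unique Key with a Time Column Name. One plausible reading, sketched below, is "keep the most recent row per key". This is an assumption for illustration only: the function name `dedupe_rows` and the sample column names are hypothetical, and the guide does not state exactly which row Sprinkle keeps.

```python
from typing import Any, Dict, Hashable, Iterable, List, Mapping


def dedupe_rows(rows: Iterable[Mapping[str, Any]], unique_key: str,
                time_column: str) -> List[Mapping[str, Any]]:
    """Keep one row per unique_key value: the one with the greatest time_column.

    On ties, the row seen later wins. Output order follows the first
    appearance of each key.
    """
    latest: Dict[Hashable, Mapping[str, Any]] = {}
    for row in rows:
        key = row[unique_key]
        kept = latest.get(key)
        if kept is None or row[time_column] >= kept[time_column]:
            latest[key] = row
    return list(latest.values())


# Rows as they might arrive across two ingestions of the same directory.
rows = [
    {"id": 1, "updated_at": "2024-01-01", "status": "pending"},
    {"id": 2, "updated_at": "2024-01-02", "status": "shipped"},
    {"id": 1, "updated_at": "2024-01-03", "status": "delivered"},
]
deduped = dedupe_rows(rows, unique_key="id", time_column="updated_at")
```

Here `deduped` holds two rows: the `delivered` row for `id` 1 and the `shipped` row for `id` 2. The time column only needs values that compare consistently, such as ISO-8601 date strings or numeric timestamps.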