FTP

Guide to integrating your files using FTP

Datasource Concepts

Before setting up the datasource, learn about datasource concepts here

To learn about Connection, refer here

To learn about datasource, refer here

Navigate to Datasources -> Datasources Tab -> Add -> Select FTP or FTPs -> Provide the name -> Create
Connection Tab:
- From the drop-down, select the name of the connection created in STEP-2
- Update

Datasets Tab: To learn about Dataset, refer here. Add a Dataset for each directory that you want to replicate, providing the following details:

Table Name (Required): Table name suffix used to create the table in the warehouse
Directory Path (Required): Provide the full path to the directory
Ingestion Mode (Required):
- Complete: The full folder is downloaded and ingested in every ingestion job run
- Incremental: Only the new files are ingested in every ingestion job run. Use this option if your folder is very large and you receive new files continuously
  - Remove Duplicate Rows:
    - Unique Key: Unique key of the table, used to deduplicate data across multiple ingestions
    - Time Column Name: Column used to order data for deduplication
  - Max Job Runtime: Maximum time, in minutes, for which data should be downloaded. The ingestion job runs for at most the specified number of minutes and then updates the checkpoint. The next run continues from the checkpoint.
File Type: Select the file format
- JSON
- CSV
  - Select Delimiter: Comma, Tab, Pipe, Dash, or Other Character
- Parquet
- ORC
Destination Schema (Required): Data warehouse schema into which the table will be ingested
Warehouse Table Name (Optional): If not given, Sprinkle creates the table with a name like ds_<datasourcename>_<tablename>
Destination Create Table Clause: Provide additional clauses for the warehouse CREATE TABLE query, such as clustering and partitioning, which are useful for optimizing DML statements. Learn more on how to use this field.
Create
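
To make the Incremental mode options more concrete, here is a minimal Python sketch of how Max Job Runtime and Remove Duplicate Rows could behave. It is an illustration only, not Sprinkle's implementation. The function names `ingest_incremental` and `dedup_rows`, the `FileEntry` and `Row` types, and the `download` callback are all hypothetical. The sketch also assumes that the checkpoint is the modification timestamp of the last file processed and that the row with the largest time column value is the one kept; the guide does not specify either detail.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical types for illustration; none of these names come from Sprinkle.
FileEntry = Tuple[str, float]  # (path on the FTP server, modification timestamp)
Row = Dict[str, Any]


def ingest_incremental(
    files: Iterable[FileEntry],
    checkpoint: float,
    max_runtime_minutes: float,
    download: Callable[[str], None],
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Download files newer than `checkpoint`, oldest first, until the
    time budget runs out, and return the updated checkpoint.

    Assumption: the checkpoint is the modification timestamp of the last
    file processed, and timestamps are unique per file.
    """
    deadline = clock() + max_runtime_minutes * 60
    pending = sorted((f for f in files if f[1] > checkpoint), key=lambda f: f[1])
    for path, modified in pending:
        if clock() >= deadline:
            break  # budget used up; the next run resumes from `checkpoint`
        download(path)
        checkpoint = modified
    return checkpoint


def dedup_rows(rows: Iterable[Row], unique_key: str, time_column: str) -> List[Row]:
    """Keep one row per unique key, preferring the largest time column value.

    Assumption: "latest row wins"; the guide only says the time column is
    used to order data for deduplication.
    """
    latest: Dict[Any, Row] = {}
    for row in rows:
        key = row[unique_key]
        if key not in latest or row[time_column] >= latest[key][time_column]:
            latest[key] = row
    return list(latest.values())
```

Under these assumptions, a run that hits the time limit leaves the remaining files for the next run, and rows that arrive again in a later file replace their earlier versions instead of being duplicated.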

In the Ingestion Jobs tab:
