# FTP

## Pipeline Concepts

Before setting up the Pipeline, learn about Pipeline concepts [here](https://docs.sprinkledata.com/product/ingesting-your-data/pipelines)

## Step by Step Guide

### STEP-1: Configure FTP/FTPs Connection

To learn about Connection, refer [here](https://docs.sprinkledata.com/product/ingesting-your-data/pipelines)

* Log into Sprinkle application
* Navigate to Ingest -> Connections Tab -> New Connection
* Select FTP or FTPs
* Provide all the mandatory details
  * *Name*: Name to identify this connection
  * *Host*: FTP Hostname
  * *Port*: FTP port. Default is 22.
  * *User*: FTP username
* *Password*: FTP password
* Test Connection
* Create
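Before saving the connection, you can optionally verify the host, port, and credentials from your own machine. A minimal sketch using Python's standard `ftplib` (the host, user, and password are placeholders, and the default port of 21 is the standard FTP control port, not necessarily what your server uses):

```python
from ftplib import FTP, FTP_TLS, error_perm

def check_ftp(host, port=21, user="", password="", use_tls=False, timeout=5):
    """Return True if we can log in and list the root directory, else False."""
    client = FTP_TLS() if use_tls else FTP()
    try:
        client.connect(host, port, timeout=timeout)
        client.login(user, password)
        if use_tls:
            client.prot_p()  # protect the data channel with TLS as well
        client.nlst()  # list the root directory to confirm read access
        return True
    except (OSError, error_perm):
        return False
    finally:
        try:
            client.quit()
        except Exception:
            pass

# Example with placeholder values:
# check_ftp("ftp.example.com", 21, "myuser", "secret", use_tls=True)
```

If this returns False, fix the credentials or firewall rules before configuring the same values in Sprinkle.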

### STEP-2: Configure FTP/FTPs Pipeline

To learn about Pipeline, refer [here](https://docs.sprinkledata.com/product/ingesting-your-data/pipelines)

* Navigate to Ingest -> Pipeline Tab -> Add
* Select FTP or FTPs
* Provide the name -> Create
* **Connection Tab**:
  * From the drop-down, select the name of the connection created in STEP-1
  * Update

### STEP-3: Create Dataset

**Datasets Tab**: To learn about Dataset, refer [here](https://docs.sprinkledata.com/product/ingesting-your-data/pipelines). Add a Dataset for each **directory** that you want to replicate, providing the following details

* *Table Name* (Required): Table name suffix that will be used to create the table in the warehouse
* *Directory Path* (Required): Provide the full path of the directory
* *Ingestion Mode* (Required):
  * *Complete*: The full folder is downloaded and ingested in every ingestion job run
  * *Incremental*: Only new files are ingested in every ingestion job run. Use this option if your folder is very large and receives new files continuously
    * *Remove Duplicate Rows*:
      * *Unique Key*: Unique key from the table, used to dedupe data across multiple ingestions
      * *Time Column Name*: Used to order data for deduping
    * *Max Job Runtime*: Maximum time in minutes for which data should be downloaded. The ingestion job runs for at most the specified minutes and then updates its checkpoint; the next run continues from that checkpoint.
* *File Type*: Select the File Format
  * JSON
  * CSV
    * Select Delimiter - Comma, Tab, Pipe, Dash, Other Character
  * Parquet
  * ORC
* *Destination Schema* (Required) : Data warehouse schema where the table will be ingested into
* *Warehouse Table name* (Optional): If not given, Sprinkle will name the table ds\_\<Pipelinename>\_\<tablename>
* *Destination Create Table Clause*: Provide additional clauses to warehouse-create table queries such as clustering, partitioning, and more, useful for optimizing DML statements. [Learn more](https://docs.sprinkledata.com/product/ingesting-your-data/pipelines/databases/features/destination-create-table-clause) on how to use this field.
* Create
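In Incremental mode, the same logical row can arrive in more than one file, which is why the *Unique Key* and *Time Column Name* fields exist: deduplication keeps the latest row per key, ordered by the time column. A conceptual sketch of that idea in plain Python (column names `id` and `updated_at` are illustrative, not Sprinkle requirements):

```python
def dedup_latest(rows, unique_key, time_column):
    """Keep, for each unique_key value, the row with the greatest time_column."""
    latest = {}
    for row in rows:
        key = row[unique_key]
        if key not in latest or row[time_column] > latest[key][time_column]:
            latest[key] = row
    return list(latest.values())

# Two ingestion runs delivered two versions of id 1; only the
# most recent one (by updated_at) survives deduplication.
rows = [
    {"id": 1, "updated_at": "2024-01-01", "status": "new"},
    {"id": 1, "updated_at": "2024-01-02", "status": "shipped"},
    {"id": 2, "updated_at": "2024-01-01", "status": "new"},
]
deduped = dedup_latest(rows, "id", "updated_at")
```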

### STEP-4: Run and schedule Ingestion

In the **Ingestion Jobs** tab**:**

* Trigger the Job, using Run button
* To schedule, enable Auto-Run. Change the frequency if needed
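The *Max Job Runtime* setting from STEP-3 time-boxes each run: the job downloads files until the budget is spent, records a checkpoint, and the next run resumes from there. A conceptual sketch of that loop (the function name and file list are illustrative, not Sprinkle internals):

```python
import time

def run_with_checkpoint(files, download, checkpoint, max_minutes):
    """Download files in listing order until done or the time budget is spent.
    Returns the updated checkpoint: the index of the next file to fetch."""
    deadline = time.monotonic() + max_minutes * 60
    i = checkpoint
    while i < len(files) and time.monotonic() < deadline:
        download(files[i])
        i += 1
    return i

# The first run fetches everything that fits in the budget; a later run
# resumes from the returned checkpoint and skips files already ingested.
fetched = []
checkpoint = run_with_checkpoint(["a.csv", "b.csv", "c.csv"],
                                 fetched.append, 0, max_minutes=1)
```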
