# SFTP

## Pipeline Concepts

Before setting up the Pipeline, learn about Pipeline concepts [here](/product/ingesting-your-data/pipelines.md).

## Step by Step Guide

### STEP-1: Configure SFTP Connection

To learn about Connections, see [this page](/product/ingesting-your-data/pipelines.md).

* Log in to the Sprinkle application
* Navigate to Ingest -> Connections Tab -> New Connection
* Select SFTP
* Provide all the mandatory details
  * *Name*: Name to identify this connection
  * *SSH Host*: IP address or hostname of the SSH server.
  * *SSH Port*: Port of the SSH server. Default is 22.
  * *SSH Login Username*: SFTP username
  * *Authentication Mode*: Choose one of the following and enter the required information:
    * SSH Public Key
    * Password
* Test Connection
* Create
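
If Test Connection fails, it can help to first confirm that the SSH host and port are reachable at all. The following is a minimal sketch using only the Python standard library; the function name `can_reach` and its parameters are illustrative assumptions, not part of Sprinkle. It only checks reachability from the machine where it runs, which may not be the network Sprinkle connects from, and it does not verify credentials.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration: it checks network reachability
    only and does not perform any SSH or SFTP authentication.
    """
    try:
        # create_connection resolves the hostname and opens a TCP socket;
        # the context manager closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach("sftp.example.com", 22)` (a placeholder hostname) returns `False` if a firewall blocks the port or the hostname does not resolve.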

### STEP-2: Configure SFTP Pipeline

To learn about Pipelines, see [this page](/product/ingesting-your-data/pipelines.md).

* Navigate to Ingest -> Pipeline Tab -> Add
* Select SFTP
* Provide the name -> Create
* **Connection Tab**:
  * From the drop-down, select the name of the connection created in STEP-1
  * Update

### STEP-3: Create Dataset

**Datasets Tab**: To learn about Datasets, see [this page](/product/ingesting-your-data/pipelines.md). Add a Dataset for each **directory** that you want to replicate, providing the following details:

* *File Type*: Select the File Format
  * JSON
  * CSV
    * Select Delimiter - Comma, Tab, Pipe, Dash, Other Character
* *Directory Path* (Required): Provide the full path of the directory
* *Ingestion Mode* (Required):
  * *Complete*: The full folder is downloaded and ingested in every ingestion job run
  * *Incremental*: Only new files are ingested in every ingestion job run. Use this option if your folder is very large and receives new files continuously
    * *Remove Duplicate Rows*:
      * *Unique Key*: Unique key from the table, used to deduplicate data across multiple ingestions
      * *Time Column Name*: Used to order data for deduplication
    * *Max Job Runtime*: Maximum time, in minutes, for which data is downloaded in a single run. The ingestion job runs for at most this many minutes and then updates the checkpoint. The next run continues from the checkpoint.
* *Destination Schema* (Required): Data warehouse schema into which the table will be ingested
* *Destination Table Name* (Required): Table name suffix which will be used to create the table in the warehouse
* *Destination Create Table Clause*: Provide additional clauses for the warehouse create-table queries, such as clustering or partitioning, which are useful for optimizing DML statements. [Learn more](/product/ingesting-your-data/pipelines/databases/features/destination-create-table-clause.md) about how to use this field.
* Create
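
The Incremental mode and Remove Duplicate Rows options above can be pictured with a small conceptual sketch. This is not Sprinkle's implementation: the function names, the use of a file modification time as the checkpoint, and the rule of keeping the row with the latest time-column value are all assumptions made for illustration.

```python
def select_new_files(files, checkpoint):
    """Pick files newer than the checkpoint (assumed to be a modified time).

    files: iterable of (path, modified_time) tuples.
    Returns the new paths, oldest first, and the advanced checkpoint.
    """
    new_files = sorted(
        (f for f in files if f[1] > checkpoint),
        key=lambda f: f[1],
    )
    # If nothing is new, the checkpoint stays where it was.
    new_checkpoint = new_files[-1][1] if new_files else checkpoint
    return [path for path, _ in new_files], new_checkpoint


def dedupe_rows(rows, unique_key, time_column):
    """Keep one row per unique key, preferring the latest time-column value."""
    latest = {}
    for row in rows:
        key = row[unique_key]
        # Replace the stored row only if this one is at least as recent.
        if key not in latest or row[time_column] >= latest[key][time_column]:
            latest[key] = row
    return list(latest.values())
```

In this sketch, a run with checkpoint `150` over files modified at `100`, `200`, and `300` would pick up only the last two and move the checkpoint to `300`. Two rows sharing the same unique key would collapse to the one with the larger time-column value.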

### STEP-4: Run and schedule Ingestion

In the **Ingestion Jobs** tab:

* Trigger the job using the Run button
* To schedule, enable Auto-Run. Change the frequency if needed

