# Python Notebooks

## Overview

You can use your favourite **Notebooks** on Sprinkle. In the navigation panel, :mouse\_three\_button: click on Notebooks.

The Notebook :notebook\_with\_decorative\_cover: is an open-source web application that allows you to create and share documents that contain live code, narrative text, equations, and visualizations :bar\_chart: :chart:.

Use notebooks for data cleaning, transformations, numerical simulation, statistical modeling, data visualization, and machine learning.

### Watch Video :tv:

{% embed url="<https://youtu.be/8U09MolZ0Tk>" %}
Python Notebooks: Explanation & Feature Walkthrough
{% endembed %}

## Feature Walkthrough :person\_walking:

### Create Python Notebook

* :mouse\_three\_button: Click on **Transform -> Python Notebooks** in the left navigation pane to start using the Python Notebooks feature on Sprinkle. The listing page shows all the Python Notebooks that have been created.
* :mouse\_three\_button: Click on **Create New Notebook** in the top-right corner of the page to create a new Python Notebook.
* Provide a **name** for the Python Notebook.
* Select **Kernel (Optional)**: Select **python3** from the dropdown.
* Select **VM Size (Optional)**: Choose the number of CPUs and the virtual machine memory size for the Python Notebook from the dropdown.
  * Option 1: 1 CPU and 1700 Mi (mebibytes) of virtual machine memory.
  * Option 2: 2 CPUs and 1800 Mi (mebibytes) of virtual machine memory.
* **User API Key** and **User API Secret** are optional in this form, but both are required if you want to use the Sprinkle SDK functions. They can also be provided in the settings after the Python Notebook is created.
  * To generate an API Key and Secret, click on your user icon in the top right, then **Account -> API Keys**. :mouse\_three\_button: Click on **Generate new** to create a new API Key and Secret for yourself.

### Using Sprinkle SDK

The Sprinkle SDK lets you import data from Sprinkle's SQL Explores and Reports for use in the notebook.

* **Import Sprinkle SDK**

```
from sprinkleSdk import SprinkleSdk as sp
```

* **Read Report**

Reads data from the specified report into a data frame.

```
df = sp.readReport('<report_id>')
```

* **Read SQL Explore**

Reads data from the specified SQL Explore into a data frame.

```
df = sp.readExplore('<explore_id>')
```

Once the data is imported, you can run all kinds of analyses on it in your Notebook.
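When a notebook draws on several reports and SQL Explores, the two read calls above can be wrapped in a small helper. The sketch below is illustrative only: `load_frames` and its parameters are hypothetical names, not part of the Sprinkle SDK, and it assumes `sdk` is the `sp` object imported earlier, exposing `readReport` and `readExplore` as shown above.

```python
def load_frames(sdk, report_ids=(), explore_ids=()):
    """Read several reports and SQL Explores into a dict of data frames.

    Hypothetical helper: `sdk` is assumed to be the imported `sp` object.
    Keys are prefixed so a report and an explore with the same ID cannot clash.
    """
    frames = {}
    for report_id in report_ids:
        frames[f"report:{report_id}"] = sdk.readReport(report_id)
    for explore_id in explore_ids:
        frames[f"explore:{explore_id}"] = sdk.readExplore(explore_id)
    return frames
```

In a notebook this might be called as `frames = load_frames(sp, report_ids=['<report_id>'], explore_ids=['<explore_id>'])`. Passing the SDK object in as an argument keeps the helper easy to try out with a stand-in object.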

* **Create a table or update an existing table in the warehouse using a data frame**

{% code overflow="wrap" %}

```
sp.createOrUpdateTable('<PipelineName>','<destinationTableName>', df)
```

{% endcode %}

Multiple tables can be created in a single Pipeline. The Pipeline created by the above function appears under **Ingest -> File Uploads**.
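Because one Pipeline can hold several tables, a loop over a mapping of table names to data frames is one way to publish them together. This is a minimal sketch under stated assumptions: `publish_tables` is a hypothetical helper, `sdk` is assumed to be the imported `sp` object, and `createOrUpdateTable` is called with the same three arguments as in the example above.

```python
def publish_tables(sdk, pipeline_name, frames):
    """Write each data frame to its own warehouse table under one pipeline.

    Hypothetical helper: `sdk` is assumed to be the imported `sp` object and
    `frames` a dict mapping destination table names to data frames.
    Returns the table names in the order they were written.
    """
    written = []
    for table_name, df in frames.items():
        sdk.createOrUpdateTable(pipeline_name, table_name, df)
        written.append(table_name)
    return written
```

A call such as `publish_tables(sp, '<PipelineName>', {'<tableA>': df_a, '<tableB>': df_b})` would then issue one `createOrUpdateTable` call per entry, all against the same Pipeline.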

* **Drop the table from the warehouse**

```
sp.dropTable('<table_name>')
```

### Working with Spark Sessions

* **Get a Spark session with the default configuration**

```
spark = sp.getOrCreate()
```

* **Change the Spark app name while creating the default Spark session**

```
spark = sp.appName('some-name').getOrCreate()
```

* **Get a Spark session with a customized configuration**

```
# Parentheses let the chained builder calls span several lines.
spark = (
    sp.sparkBuilder()
    .appName('some-name')
    .config("spark.some.config.option1", "some-value")
    .config("spark.some.config.option2", "some-value")
    .getOrCreate()
)
```
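When the configuration options live in a dictionary, the builder chain can be assembled in a loop instead of being written out by hand. The sketch below is an assumption-laden illustration: `build_spark_session` is a hypothetical helper, `sdk` is assumed to be the imported `sp` object, and each of `sparkBuilder`, `appName`, and `config` is assumed to return a chainable builder, as the chained calls above suggest.

```python
def build_spark_session(sdk, app_name, options=None):
    """Build a Spark session from an app name and a dict of config options.

    Hypothetical helper: `sdk` is assumed to be the imported `sp` object,
    and every builder method is assumed to return a chainable builder.
    """
    builder = sdk.sparkBuilder().appName(app_name)
    for key, value in (options or {}).items():
        # Apply one .config(key, value) call per dictionary entry.
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

For example, `spark = build_spark_session(sp, 'some-name', {"spark.some.config.option1": "some-value"})` would be equivalent to a chain with a single `.config(...)` call.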
