Sprinkle Docs
📒

Data Catalog

Data Catalog Explanation & Detailed Feature Walkthrough

Introduction

Modern-day businesses 🏭 generate large volumes of data from many sources, stored in varied locations. Data can reside in files, relational databases, data warehouses, and the cloud.
Sprinkle’s No-Code Data Integration & Data Analytics solutions enable organisations to bring all their data to one place and perform a holistic, 360-degree analysis.
Sprinkle presents the Data Catalog to organize, discover, manage & use data with ease.

Watch Video 📺

Catalog: Explanation & Feature Walkthrough

What is a Data Catalog❓

A Data Catalog is a catalogue 📒 of all the data assets of an organization, along with tools 🧰 that help users locate the data required for analysis.
It is the go-to place for all data-related needs of users. The Data Catalog enables users to “Search & Discover” data, “Understand” the context of the data through Technical & Business Metadata, and “Manage” data and its access.

Feature Walkthrough 🚶

From the left navigation panel, 🖱 click on the Catalog option to use the Data Catalog.

Table Listing Page

The Catalog homepage is a table listing page that provides brief information about the tables in the data warehouse.
It presents a tabular view of the information & the Tags. The metadata columns are highlighted in the image below. The table listing page can be customized to include columns as needed, using the customize column display icon on the right.
Table Listing Page: Business Metadata Columns & Customize Column Display Options
The search bar 🔍 can be used to search the schema using keywords. To view the details of a table, 🖱 click on the table.

Table Overview

On clicking a table, the overview section opens. To view column-level stats, 🖱 click on Refresh Stats.
The stats are displayed at the column level: the name of the column, its data type, description (provided by the user), total distinct values, and missing values in the column.
Enable the Advance Stats toggle button to generate stats such as distinct values, missing values & frequency histograms, which show the spread of the data.
Make sure to run Refresh Stats after enabling the Advance Stats toggle button so that the stats & the Frequency Charts are displayed.
Advance Stats & Refresh Stats
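To make the column-level stats concrete, here is a minimal Python sketch of what distinct values, missing values, and a frequency histogram mean for a single column. It is an illustration only, not how Sprinkle computes them: the function name column_stats, the sample data, and the choice to treat None as a missing value are all assumptions.

```python
from collections import Counter

def column_stats(values):
    """Summarize one column: distinct count, missing count, frequency histogram."""
    # Assumption: a missing value is represented as None.
    present = [v for v in values if v is not None]
    return {
        "distinct": len(set(present)),          # number of unique non-missing values
        "missing": len(values) - len(present),  # number of missing entries
        "histogram": Counter(present),          # value -> number of occurrences
    }

# Hypothetical sample column
city = ["Pune", "Delhi", None, "Pune", "Mumbai", None, "Pune"]
stats = column_stats(city)
print(stats["distinct"], stats["missing"], stats["histogram"].most_common(1))
```

The histogram here is a simple count per value, which is the kind of spread a frequency chart visualizes.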

Jobs

The Jobs tab enables the user to run jobs and keep track of previous jobs.
For a periodic refresh of stats, users can enable Autorun and set frequency according to their requirements.
Autorun

Preview

The Preview tab shows the top rows of the table. Using Show Entries, you can view up to 500 rows.
Preview

Pipeline & Lineage Graph

Pipeline and Lineage help identify the dependencies of this table and perform impact analysis. The Pipeline Graph shows the sequence of processes that create or update this particular table.
Pipeline Graph
Lineage shows all the input tables that are used to create this particular table.
Lineage
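Lineage and impact analysis can be thought of as traversals of a dependency graph. The sketch below is a hedged illustration of that idea, not Sprinkle's implementation: the table names, the inputs mapping, and the functions upstream and downstream are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each table maps to the input tables used to create it.
inputs = {
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
    "sales_report": ["orders_clean", "customers_clean"],
}

def upstream(table, inputs):
    """Lineage: all tables that directly or indirectly feed into `table`."""
    seen = set()
    queue = deque(inputs.get(table, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(inputs.get(current, []))
    return seen

def downstream(table, inputs):
    """Impact analysis: all tables that depend on `table`, directly or indirectly."""
    return {t for t in inputs if table in upstream(t, inputs)}

print(sorted(upstream("sales_report", inputs)))
print(sorted(downstream("orders_raw", inputs)))
```

Walking the graph upward answers "which inputs build this table?", while walking it downward answers "which tables are affected if this one changes?".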

Status

The Status drop-down at the top can be used by the owner or the data stewards to tag the data assets correctly. This helps other users identify the status of the data before using it further. The tags available are WIP, Verified, Deprecated & Has Issues.
Status Drop-down
On the Table Page, users can add Tags by 🖱 clicking on the Tags button.
Go through the documentation to know more: Tags