Sprinkle Docs
Data Catalog
Data Catalog Explanation & Detailed Feature Walkthrough

Introduction

Modern businesses generate large volumes of data from many sources, stored in many different locations: files, relational databases, data warehouses and the cloud.
Sprinkle's no-code data integration and data analytics solutions enable organisations to bring all their data into one place and analyse it holistically.
Sprinkle's Data Catalog helps users organize, discover, manage and use that data with ease.

Watch Video

Data Catalog: Explanation & Feature Walkthrough

What is a Data Catalog?

A Data Catalog is a catalogue of all the organization's data assets, along with tools that help users locate the data required for analysis.
It is the go-to place for all of the users' data-related needs. The Data Catalog enables users to "Search & Discover" data, "Understand" the context of the data through Technical & Business Metadata, and "Manage" data and access to it.

Feature Walkthrough

From the left navigation panel, click on the Catalog option to open the Data Catalog.

Table Listing Page

The Catalog homepage is a table listing page that gives brief information about the tables in the data warehouse.
It presents the tables and their Business Metadata in a tabular view. The metadata columns are highlighted in the image below. The columns shown on the table listing page can be customized using the customize column display icon on the right.
Table Listing Page: Business Metadata Columns & Customize Column Display Options
The search bar can be used to search the schema by keyword. To view the details of a table, click on the table.
Search

Table Overview

Clicking on a table opens its overview section. To view column-level stats, click on Refresh Stats.
The stats are displayed per column: the column name, its datatype, its description (provided by the user), the total number of distinct values and the number of missing values.
Enable the Advance Stats toggle button to generate stats such as distinct values, missing values and frequency histograms, which show the spread of the data.
Make sure to run Refresh Stats after enabling the Advance Stats toggle so that the stats and the frequency charts are displayed.
Advance Stats & Refresh Stats
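To make the column-level stats concrete, here is a minimal Python sketch of how distinct values, missing values and a frequency count could be computed for one column. It is an illustration only, not Sprinkle's implementation: the function name column_stats and the sample rows are hypothetical, and missing values are assumed to be represented as None.

```python
from collections import Counter


def column_stats(rows, column):
    """Compute basic stats for one column of a table given as a list of dicts."""
    values = [row.get(column) for row in rows]
    # Assumption: a missing value is represented as None.
    present = [v for v in values if v is not None]
    return {
        "distinct": len(set(present)),
        "missing": len(values) - len(present),
        # value -> count: the kind of data a frequency chart is drawn from
        "frequency": Counter(present),
    }


# Hypothetical sample rows standing in for a warehouse table.
orders = [
    {"order_id": 1, "city": "Pune"},
    {"order_id": 2, "city": "Delhi"},
    {"order_id": 3, "city": None},
    {"order_id": 4, "city": "Pune"},
]

stats = column_stats(orders, "city")
print(stats["distinct"], stats["missing"], dict(stats["frequency"]))
```

In this sketch the frequency counter plays the role of the histogram data: each distinct value is mapped to the number of rows in which it occurs.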

Jobs

The Jobs tab lets the user run jobs and keep track of earlier jobs.
For a periodic refresh of stats, users can enable Autorun and set the frequency according to their requirements.
Autorun

Preview

The Preview tab shows the top rows of the table. Using Show Entries, you can view up to 500 rows.
Preview

Pipeline & Lineage Graph

Pipeline and Lineage help identify this table's dependencies and support impact analysis. The Pipeline Graph shows the sequence of processes that create or update this particular table.
Pipeline Graph
Lineage shows all the input tables that are used to create this particular table.
Lineage
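The idea behind lineage and impact analysis can be sketched as a walk over a dependency graph. The Python example below is a conceptual illustration under stated assumptions, not how Sprinkle stores or computes lineage: the table names, the INPUTS map and the functions upstream and downstream are all hypothetical.

```python
# Hypothetical dependency map: each table -> the input tables used to build it.
INPUTS = {
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
    "revenue_by_customer": ["orders_clean", "customers_clean"],
    "exec_dashboard": ["revenue_by_customer"],
}


def upstream(table, inputs=INPUTS):
    """All tables that directly or indirectly feed `table` (its lineage)."""
    seen = set()
    stack = list(inputs.get(table, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Source tables have no entry in the map, so the walk stops there.
            stack.extend(inputs.get(current, []))
    return seen


def downstream(table, inputs=INPUTS):
    """All tables affected by a change to `table` (impact analysis)."""
    return {t for t in inputs if table in upstream(t, inputs)}


print(sorted(upstream("revenue_by_customer")))
print(sorted(downstream("orders_raw")))
```

Reading the graph upstream answers "which input tables is this table built from?", while reading it downstream answers "which tables are affected if this one changes?".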

Status

The status drop-down at the top can be used by the owner or the data stewards to tag data assets correctly. This helps other users identify the status of the data before using it further. The available tags are WIP, Verified, Deprecated and Has Issues.
Status Drop-down
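As a rough illustration of how these status tags might be used downstream, the Python sketch below models the four tags as an enumeration and flags tables that are not yet Verified. Only the four tag names come from this page. The TableStatus class, the needs_caution rule and the sample catalog are assumptions made for the example and are not part of Sprinkle.

```python
from enum import Enum


class TableStatus(Enum):
    """The four status tags listed on this page."""
    WIP = "WIP"
    VERIFIED = "Verified"
    DEPRECATED = "Deprecated"
    HAS_ISSUES = "Has Issues"


def needs_caution(status):
    # Assumed policy for this sketch: anything not Verified deserves a second look.
    return status is not TableStatus.VERIFIED


# Hypothetical tables and the tags their owners assigned.
catalog = {
    "orders_clean": TableStatus.VERIFIED,
    "orders_legacy": TableStatus.DEPRECATED,
    "returns_raw": TableStatus.WIP,
}

flagged = sorted(name for name, status in catalog.items() if needs_caution(status))
print(flagged)
```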
On the Table Page, users can add Business Metadata by clicking on the Metadata button.
See the Business Metadata documentation to learn more.