Concepts and Terminology
This page gives an overview of the key concepts used across the Validio platform.
- Credentials are used to access your Source. One set of Credentials can be used to set up multiple Sources.
- A Source is a connector to one source system, such as a Data Warehouse, a Data Stream, or an Object Storage. A Source is defined as one table in a Data Warehouse, one topic in a Data Stream, or one specified set of schema-conformant files in an Object Storage.
  Segmentations, Windows, and Validators are defined for each Source.
- Validio validates data using metrics calculated over a subset of the data in a Source. Each such subset is called a Window. A Window can be defined as a time interval, a fixed-size batch, or a file. There is also a Global Window, which considers all data in the Source.
  Each Source must have at least one Window, but several Windows can be created for each Source.
- Segmentation allows validation per segment, also referred to as a group. You can think of this as a GROUP BY statement in SQL. Each Source has at least one segment. The default Segmentation is called Unsegmented.
- A Validator specifies which metrics to validate on which fields, and what Threshold is considered acceptable. For each Validator you can also select a defined Segmentation and Window, and configure filters.
  Each Source can have one or more Validators.
- Optionally, a Notification rule can be used to send incidents to specified channels, such as Slack. Each Notification rule can include incidents from multiple Sources.
- Each Notification rule has a notification Channel attached. The same Channel can be used for multiple Notification rules.
- Lineage describes how data flows through a data stack, from its origin to its final use. For some source types, Lineage is created automatically, based on a Credential or a dbt Manifest file. For others, Lineage can be created manually, based on a Source.
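
To make the relationship between Windows, Segmentation, and Validators more concrete, the following is a minimal Python sketch of the idea. It is not Validio's API or implementation. The record fields (`country`, `amount`), the helper names, the choice of a mean metric, and the fixed lower and upper Threshold are all hypothetical and chosen only for illustration. It assumes a time-interval Window of one hour, a Segmentation on a single field, and a Validator that checks a metric against a fixed range.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Record:
    timestamp: datetime
    country: str   # hypothetical segmentation field
    amount: float  # hypothetical field to validate


def window_start(ts: datetime, size: timedelta) -> datetime:
    """Return the start of the fixed-size time interval that contains ts."""
    epoch = datetime(1970, 1, 1)
    return epoch + ((ts - epoch) // size) * size


def mean_per_window_and_segment(records, size):
    """Compute the mean amount per (window, segment), like a GROUP BY in SQL."""
    groups = defaultdict(list)
    for r in records:
        groups[(window_start(r.timestamp, size), r.country)].append(r.amount)
    return {key: mean(values) for key, values in groups.items()}


def validate(metrics, lower, upper):
    """Return the (window, segment) keys whose metric falls outside [lower, upper]."""
    return [key for key, value in sorted(metrics.items())
            if not lower <= value <= upper]


records = [
    Record(datetime(2024, 1, 1, 0, 10), "SE", 10.0),
    Record(datetime(2024, 1, 1, 0, 40), "SE", 14.0),
    Record(datetime(2024, 1, 1, 0, 20), "US", 50.0),
    Record(datetime(2024, 1, 1, 1, 5), "SE", 12.0),
    Record(datetime(2024, 1, 1, 1, 30), "US", 11.0),
]

# One-hour time-interval Window, segmented by country.
metrics = mean_per_window_and_segment(records, timedelta(hours=1))

# Fixed Threshold: the mean amount is acceptable between 5 and 20.
incidents = validate(metrics, lower=5.0, upper=20.0)
for window, segment in incidents:
    print(f"Incident: window starting {window}, segment {segment}")
```

In this sketch, only the "US" segment in the first one-hour window is flagged, because its mean of 50.0 lies outside the accepted range. A Global Window would correspond to grouping by segment only, and the Unsegmented default would correspond to grouping by window only.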