Validator Types

Overview of supported Validator types used for calculating metrics.

Map of the territory

The Validator overview and Validator configuration pages explain the concepts related to Validators and provide useful context for configuration.

Validators calculate metrics over a window

For example, a Validator calculates the mean value over a daily window, and then validates whether these daily mean values follow an expected seasonal pattern.
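The windowing step can be illustrated with a minimal sketch. It is not the product's implementation: the function name `daily_means` and the `(timestamp, value)` record shape are assumptions made for illustration only. It groups records into daily windows and computes one mean per window, which is the metric a Validator would then check against an expected pattern.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(records):
    """Group (timestamp, value) pairs into daily windows; return the mean per day.

    Hypothetical helper: the record shape is an assumption for this example.
    """
    windows = defaultdict(list)
    for timestamp, value in records:
        # The calendar date acts as the window key.
        windows[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(windows.items())}


records = [
    (datetime(2024, 1, 1, 9, 0), 10.0),
    (datetime(2024, 1, 1, 17, 0), 14.0),
    (datetime(2024, 1, 2, 9, 0), 20.0),
]

metrics = daily_means(records)
for day, value in metrics.items():
    print(day, value)
```

Each printed line is one metric datapoint per window. Validation, such as comparing against a seasonal pattern, would run on this series rather than on the raw rows.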

List of Validator types and supported metrics

Single Source Validators

Single Source Validators calculate metrics based on one dataset.

| Validator type | Metric | Description |
| --- | --- | --- |
| Numeric | Mean | Validating the mean of a numerical field. |
| Numeric | Maximum | Validating the maximum of a numerical field. |
| Numeric | Minimum | Validating the minimum of a numerical field. |
| Numeric | Standard Deviation | Validating the standard deviation of a numerical field. |
| Numeric | Sum | Validating the sum of a numerical field. |
| Relative time | Minimum difference | Validating the minimum difference between two date fields. |
| Relative time | Maximum difference | Validating the maximum difference between two date fields. |
| Relative time | Mean difference | Validating the mean difference between two date fields. |
| Volume | Count | Validating the total number of rows. |
| Volume | Percentage | Validating the percentage of rows passing certain filter criteria. |
| Volume | Duplicate count | Validating the number of duplicates. |
| Volume | Duplicate percentage | Validating the percentage of duplicates. |
| Volume | Unique count | Validating the number of unique rows. |
| Volume | Unique percentage | Validating the percentage of unique rows. |
| Freshness | Freshness | Validating the time elapsed since the data was last updated. |
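As an illustration of the Relative time metrics, the sketch below computes the minimum, maximum, and mean difference between two date fields within one window. The function name `relative_time_metrics`, the field names `created_at` and `delivered_at`, and the choice of seconds as the unit are all assumptions for this example, not documented behavior.

```python
from datetime import datetime
from statistics import mean


def relative_time_metrics(rows):
    """Return min, max and mean difference (in seconds) between two date fields.

    `rows` is a list of (created_at, delivered_at) pairs; both field names are
    hypothetical. Returns None when the window contains no rows.
    """
    diffs = [(end - start).total_seconds() for start, end in rows]
    if not diffs:
        return None
    return {"min": min(diffs), "max": max(diffs), "mean": mean(diffs)}


rows = [
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 1, 0)),  # 60 s
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 2, 0)),  # 120 s
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 3, 0)),  # 180 s
]

result = relative_time_metrics(rows)
print(result)
```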

Reference Validators

Reference Validators calculate metrics based on multiple fields from two different datasets. The two datasets can either be derived from completely different Sources, or from different windows within the same Source.

These Validators only calculate metrics if there is data in the target dataset. For example, a Categories removed Validator whose reference dataset has 4 categories and whose target dataset has 3 categories yields a result of 1. Conversely, if the target dataset has 0 categories, the Validator does not yield any result, because the target dataset has no data to calculate metrics on.
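The Categories removed example can be sketched as follows. The function name `categories_removed` and the use of `None` to represent "no result" are assumptions for illustration; only the 4-versus-3 and empty-target behavior comes from the description above.

```python
def categories_removed(reference, target):
    """Count categories present in the reference dataset but absent from the target.

    Hypothetical helper. Returns None (no result) when the target dataset is
    empty, mirroring the rule that metrics are only calculated when the target
    has data.
    """
    target_categories = set(target)
    if not target_categories:
        return None  # No data in the target dataset: no metric is produced.
    return len(set(reference) - target_categories)


reference = ["books", "games", "music", "toys"]  # 4 categories
target = ["books", "games", "music"]             # 3 categories

print(categories_removed(reference, target))  # 1
print(categories_removed(reference, []))      # None
```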

| Validator type | Metric | Description |
| --- | --- | --- |
| Numeric anomaly | Count | Number of datapoints identified as anomalies. |
| Numeric anomaly | Percentage | Percentage of datapoints identified as anomalies. |
| Numeric distribution | Relative entropy | Validating differences in the distribution between two datasets. |
| Numeric distribution | Mean ratio | Validating the ratio of the mean between two datasets. |
| Numeric distribution | Maximum ratio | Validating the ratio of the maximum value between two datasets. |
| Numeric distribution | Minimum ratio | Validating the ratio of the minimum value between two datasets. |
| Numeric distribution | Standard deviation ratio | Validating the ratio of the standard deviation between two datasets. |
| Relative volume | Count ratio | Validating the ratio of the number of rows in the target dataset to the number of rows in the reference dataset. |
| Relative volume | Percentage ratio | Validating the number of rows in the target dataset as a percentage of the number of rows in the reference dataset. |
| Categorical distribution | Categories added | Validating the number of new categories in the target dataset against a reference dataset. |
| Categorical distribution | Categories removed | Validating the number of removed categories in the target dataset against a reference dataset. |
| Categorical distribution | Categories changed | Validating the number of changed categories in the target dataset against a reference dataset. |
| Categorical distribution | Relative entropy | Validating differences in the distribution of a categorical field between two datasets. |
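To make two of the reference metrics concrete, here is a hedged sketch of Count ratio and of Relative entropy for a categorical field. Several details are assumptions rather than documented behavior: the function names, returning `None` when either dataset is empty, computing relative entropy as the Kullback-Leibler divergence of the target from the reference, the natural logarithm, and the small `epsilon` floor used as a crude guard for categories missing from the reference.

```python
import math
from collections import Counter


def count_ratio(target_rows, reference_rows):
    """Rows in target divided by rows in reference; None if either is empty (assumption)."""
    if not target_rows or not reference_rows:
        return None
    return len(target_rows) / len(reference_rows)


def relative_entropy(target, reference, epsilon=1e-9):
    """KL divergence D(target || reference) over category frequencies, in nats.

    Assumed definition. `epsilon` floors reference probabilities so that a
    category unseen in the reference does not cause a division by zero.
    """
    if not target or not reference:
        return None
    p_counts, q_counts = Counter(target), Counter(reference)
    total = 0.0
    for category, count in p_counts.items():
        p_i = count / len(target)
        q_i = max(q_counts[category] / len(reference), epsilon)
        total += p_i * math.log(p_i / q_i)
    return total


print(count_ratio([1, 2, 3], [1, 2, 3, 4]))          # 0.75
print(relative_entropy(["a", "b"], ["a", "b"]))      # 0.0 for identical distributions
print(relative_entropy(["a", "a"], ["a", "b"]))      # log(2), roughly 0.693
```

Under this sketch, the Percentage ratio metric would be the count ratio multiplied by 100.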