Validator Types
Overview of supported Validator types used for calculating metrics.
Map of the territory
The Validator overview and Validator configuration pages explain the concepts related to Validators and provide useful context for configuration.
Validators calculate metrics over a window
For example, a Validator calculates the mean value over a daily window and then validates whether these daily mean values follow an expected seasonal pattern.
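The windowing idea can be illustrated with a small sketch. This is not the product's implementation; the function name `daily_means` and the `(timestamp, value)` record shape are assumptions made purely for illustration. It groups records into daily windows and computes one mean per window, which is the series a Validator would then check against an expected pattern.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(records):
    """Group (timestamp, value) pairs by calendar day and average each day.

    Hypothetical helper for illustration only.
    """
    buckets = defaultdict(list)
    for timestamp, value in records:
        # The calendar date acts as the window key.
        buckets[timestamp.date()].append(value)
    # One metric value per daily window, in chronological order.
    return {day: mean(values) for day, values in sorted(buckets.items())}


records = [
    (datetime(2024, 1, 1, 9, 0), 10.0),
    (datetime(2024, 1, 1, 17, 30), 20.0),
    (datetime(2024, 1, 2, 8, 15), 30.0),
]
result = daily_means(records)
```

Here the first day has two datapoints and the second day has one, so each window yields its own mean regardless of how many records fall into it.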
List of Validator types and supported metrics
Single Source Validators
Single Source Validators calculate metrics based on one dataset.
Validator type | Metric | Description |
---|---|---|
Numeric | Mean | Validating the mean of a numerical field. |
Numeric | Maximum | Validating the maximum of a numerical field. |
Numeric | Minimum | Validating the minimum of a numerical field. |
Numeric | Standard deviation | Validating the standard deviation of a numerical field. |
Numeric | Sum | Validating the sum of a numerical field. |
Relative time | Minimum difference | Validating the minimum difference between two date fields. |
Relative time | Maximum difference | Validating the maximum difference between two date fields. |
Relative time | Mean difference | Validating the mean difference between two date fields. |
Volume | Count | Validating the total number of rows. |
Volume | Percentage | Validating the percentage of rows passing certain filter criteria. |
Volume | Duplicate count | Validating the number of duplicates. |
Volume | Duplicate percentage | Validating the percentage of duplicates. |
Volume | Unique count | Validating the number of unique rows. |
Volume | Unique percentage | Validating the percentage of unique rows. |
Freshness | Freshness | Validating the time elapsed since the data was last updated. |
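As an illustration of the Relative time metrics, the sketch below computes the minimum, maximum, and mean difference between two date fields over the rows of one window. It is a sketch under stated assumptions, not the product's implementation: the function name `time_difference_metrics`, the field names `created_at` and `shipped_at`, and the choice of seconds as the unit are all hypothetical.

```python
from datetime import datetime
from statistics import mean


def time_difference_metrics(rows, start_field, end_field):
    """Return min, max, and mean of (end - start) in seconds.

    Hypothetical helper; returns None when there are no rows to measure.
    """
    diffs = [
        (row[end_field] - row[start_field]).total_seconds()
        for row in rows
    ]
    if not diffs:
        return None
    return {"min": min(diffs), "max": max(diffs), "mean": mean(diffs)}


rows = [
    {"created_at": datetime(2024, 1, 1, 12, 0),
     "shipped_at": datetime(2024, 1, 1, 12, 30)},
    {"created_at": datetime(2024, 1, 1, 13, 0),
     "shipped_at": datetime(2024, 1, 1, 14, 30)},
]
metrics = time_difference_metrics(rows, "created_at", "shipped_at")
```

The two rows differ by 30 and 90 minutes, so the three metrics come out as 1800, 5400, and 3600 seconds respectively.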
Reference Validators
Reference Validators calculate metrics based on multiple fields from two different datasets. The two datasets can either be derived from completely different Sources, or from different windows within the same Source.
These Validators only calculate metrics if there is data in the target dataset. For example, a Categories removed Validator where the reference dataset has 4 categories and the target dataset has 3 categories yields a result of 1. Conversely, if the target dataset has 0 categories, the Validator does not yield any result, because the target dataset has no data to calculate metrics on.
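The Categories removed example above can be sketched as follows. The function name `categories_removed` and the use of `None` to represent "no result" are assumptions for illustration; the product's own behavior and return types may differ.

```python
def categories_removed(reference, target):
    """Count categories present in the reference but missing from the target.

    Hypothetical helper. Returns None when the target has no data,
    mirroring the rule that no metric is calculated in that case.
    """
    reference_categories = set(reference)
    target_categories = set(target)
    if not target_categories:
        # No data in the target dataset: yield no result at all.
        return None
    return len(reference_categories - target_categories)


reference = ["red", "green", "blue", "yellow"]  # 4 categories
target = ["red", "green", "blue"]               # 3 categories

removed = categories_removed(reference, target)
no_result = categories_removed(reference, [])
```

With 4 reference categories and 3 target categories, `removed` is 1. With an empty target, `no_result` is `None` rather than 4, which keeps an empty window from being reported as a mass removal of categories.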
Validator type | Metric | Description |
---|---|---|
Numeric anomaly | Count | Number of datapoints identified as anomalies. |
Numeric anomaly | Percentage | Percentage of datapoints identified as anomalies. |
Numeric distribution | Relative entropy | Validating differences in the distribution between two datasets. |
Numeric distribution | Mean ratio | Validating the ratio of the mean between two datasets. |
Numeric distribution | Maximum ratio | Validating the ratio of the maximum value between two datasets. |
Numeric distribution | Minimum ratio | Validating the ratio of the minimum value between two datasets. |
Numeric distribution | Standard deviation ratio | Validating the ratio of the standard deviation between two datasets. |
Relative volume | Count ratio | Validating the ratio of the number of rows in the target dataset to the number of rows in the reference dataset. |
Relative volume | Percentage ratio | Validating the number of rows in the target dataset as a percentage of the number of rows in the reference dataset. |
Categorical distribution | Categories added | Validating the number of new categories in the target dataset against a reference dataset. |
Categorical distribution | Categories removed | Validating the number of removed categories in the target dataset against a reference dataset. |
Categorical distribution | Categories changed | Validating the number of changed categories in the target dataset against a reference dataset. |
Categorical distribution | Relative entropy | Validating differences in the distribution of a categorical field between two datasets. |
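To make the Relative entropy metric more concrete, the sketch below computes the Kullback-Leibler divergence between the category frequencies of a target and a reference dataset. Several things here are assumptions rather than documented behavior: the function name `relative_entropy`, the direction of the divergence (target relative to reference), the natural logarithm, and the small `epsilon` floor used to avoid division by zero for categories missing from the reference.

```python
import math
from collections import Counter


def relative_entropy(target, reference, epsilon=1e-9):
    """KL divergence D(target || reference) over category frequencies.

    Hypothetical helper. Returns None if either dataset is empty.
    """
    if not target or not reference:
        return None
    target_counts = Counter(target)
    reference_counts = Counter(reference)
    total = 0.0
    for category, count in target_counts.items():
        p = count / len(target)
        # Assumed crude floor so unseen reference categories stay finite.
        q = max(reference_counts[category] / len(reference), epsilon)
        total += p * math.log(p / q)
    return total


same = relative_entropy(["a", "a", "b", "b"], ["a", "a", "b", "b"])
shifted = relative_entropy(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

Identical distributions give a divergence of 0, and the value grows as the target's category mix drifts away from the reference. A production implementation would likely treat missing categories more carefully than a fixed floor does.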