Monitors validate aggregate metrics on a dataset.
Monitors support metrics that are computed on a single feature, as well as metrics that compare features from two different sources (typically the same feature in each source). Monitors of the second kind are called reference monitors (statistics).
Dataset Monitors - computed from one feature

|Category|Monitor|Description|
|---|---|---|
|Numerical|Mean|Validating the mean of a numerical feature|
| |Maximum|Validating the maximum of a numerical feature|
| |Minimum|Validating the minimum of a numerical feature|
| |Standard deviation|Validating the standard deviation of a numerical feature|
|Categorical|Mode|Validating the mode of a categorical dimension (categorical value with highest count)|
| |Cardinality|Validating the cardinality of a categorical dimension (number of unique categorical values)|
| | |Validating principal component analysis, i.e. that the amount of variance explained by the primary component stays within some interval|
|Relative time|Maximum time difference|Validating the maximum difference between two datetime features, or between a datetime feature and the time of ingestion|
| |Minimum time difference|Validating the minimum difference between two datetime features, or between a datetime feature and the time of ingestion|
| |Mean time difference|Validating the mean difference between two datetime features, or between a datetime feature and the time of ingestion|
|Null distribution|Count|Validating the count of null values for a feature|
| |Fraction|Validating the fraction of null values for a feature|
|Data volume|Row count|Validating the total number of rows|
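
To make the single-feature metrics above concrete, here is a minimal sketch in plain Python of what each one measures. It is not the product's implementation: the function names, the sample data, the choice to skip nulls before aggregating, and the use of the population standard deviation are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def numerical_metrics(values):
    """Mean, maximum, minimum and standard deviation of the non-null values."""
    present = [v for v in values if v is not None]  # assumption: nulls are skipped
    return {
        "mean": mean(present),
        "maximum": max(present),
        "minimum": min(present),
        # Assumption: population standard deviation; statistics.stdev would
        # give the sample estimate instead.
        "standard_deviation": pstdev(present),
    }


def categorical_metrics(values):
    """Mode (value with the highest count) and cardinality (number of unique values)."""
    counts = Counter(v for v in values if v is not None)
    return {
        "mode": counts.most_common(1)[0][0],  # ties resolve to the first value seen
        "cardinality": len(counts),
    }


def null_metrics(values):
    """Count and fraction of null values."""
    null_count = sum(v is None for v in values)
    return {"count": null_count, "fraction": null_count / len(values)}


def time_difference_metrics(starts, ends):
    """Maximum, minimum and mean difference in seconds between two datetime features."""
    diffs = [(end - start).total_seconds() for start, end in zip(starts, ends)]
    return {"maximum": max(diffs), "minimum": min(diffs), "mean": mean(diffs)}


# Hypothetical sample data; None marks a null value.
prices = [10.0, 20.0, None, 30.0]
colors = ["red", "blue", "red", None]
created = [datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 10),
           datetime(2024, 1, 1, 12, 20), datetime(2024, 1, 1, 12, 30)]
ingested = [datetime(2024, 1, 1, 12, 5), datetime(2024, 1, 1, 12, 25),
            datetime(2024, 1, 1, 12, 30), datetime(2024, 1, 1, 13, 0)]

row_count = len(prices)  # data volume: total number of rows
print(numerical_metrics(prices))
print(categorical_metrics(colors))
print(null_metrics(prices))
print(time_difference_metrics(created, ingested))
```

Each function returns one value per metric, which matches the idea of a monitor validating a single aggregate number per dataset.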
To use reference monitors, you need to add a reference source in the dataset pipeline configuration.
Dataset reference Monitors

|Category|Monitor|Description|
|---|---|---|
|Numerical|Relative entropy|Calculating relative entropy between the target dataset and reference dataset (used to check distribution shifts)|
| | |Comparing the mean of the target dataset to the mean of a reference dataset|
| | |Comparing the maximum of the target dataset to the maximum of a reference dataset|
| | |Comparing the minimum of the target dataset to the minimum of a reference dataset|
| | |Comparing the standard deviation of the target dataset to the standard deviation of a reference dataset|
|Categorical|Relative entropy|Calculating relative entropy between the target dataset and reference dataset (used to check distribution shifts)|
| | |Comparing the number or ratio of categories added|
| | |Comparing the number or ratio of categories removed|
| | |Comparing the number or ratio of categories that have changed. This is the number of categories added plus the number of categories removed|
| | |Comparing the PCA results of the target dataset to the PCA results of a reference dataset|
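
The categorical reference metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's implementation: the function names and sample data are hypothetical, relative entropy is taken as the Kullback-Leibler divergence of the target from the reference using the natural logarithm, and a small `epsilon` floor stands in for categories that are missing from the reference. Only counts are shown for category changes, since the denominator used for the ratio variants is not specified here.

```python
import math
from collections import Counter


def distribution(values):
    """Relative frequency of each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def relative_entropy(target, reference, epsilon=1e-9):
    """KL divergence D(target || reference) in nats.

    Assumption: categories absent from the reference get probability `epsilon`
    so the logarithm stays finite. Categories absent from the target contribute
    nothing, because 0 * log(0 / q) is taken as 0.
    """
    p = distribution(target)
    q = distribution(reference)
    return sum(p_i * math.log(p_i / q.get(category, epsilon))
               for category, p_i in p.items())


def category_changes(target, reference):
    """Number of categories added, removed and changed (added plus removed)."""
    added = set(target) - set(reference)
    removed = set(reference) - set(target)
    return {
        "added": len(added),
        "removed": len(removed),
        "changed": len(added) + len(removed),
    }


# Hypothetical sample data.
reference = ["a", "a", "b", "c"]
target = ["a", "b", "b", "d"]

print(relative_entropy(target, reference))  # grows as the distributions diverge
print(category_changes(target, reference))
```

A relative entropy of zero means the two distributions are identical, and larger values indicate a stronger distribution shift. For numerical features the same idea applies once the values are binned into a histogram, with the binning scheme being a further choice this sketch leaves open.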
To set up a monitor, navigate to the dataset pipeline that you want to add the monitor to. If you haven’t set up a dataset pipeline yet, do that first by following the dataset pipeline setup steps. Once you are on the dataset pipeline details page, follow the steps below.
Click the “New monitor” button. If you’d like to set up multiple monitors at the same time, use the “New monitors” button.
The wizard guides you through the setup, which consists of two steps. In the first step, you select which type of monitor to set up. Refer to the tables on this page for an overview of the monitors, and to the pages to the left for details on each monitor.
In the second step, you configure the monitor and decide which metrics to compute. The available parameters differ depending on which monitor you pick.
After completing the monitor setup wizard, you can navigate to the monitor’s details page to see a history graph of the computed metric and to set up alerts that are triggered based on that metric. To open the monitor’s details page, click the monitor in the list (1).
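
Conceptually, an alert on a computed metric checks each point in the metric history against some condition. The sketch below uses a fixed interval as that condition. It is hypothetical: the function name, the history format, and the bounds are assumptions, and the alert options actually available in the product are not described on this page.

```python
def out_of_bounds(history, lower, upper):
    """Return the (timestamp, value) points that fall outside [lower, upper]."""
    return [(timestamp, value) for timestamp, value in history
            if not lower <= value <= upper]


# Hypothetical history of a "fraction of null values" metric.
history = [("2024-01-01", 0.02), ("2024-01-02", 0.03), ("2024-01-03", 0.12)]

violations = out_of_bounds(history, lower=0.0, upper=0.05)
print(violations)
```

Here only the last point exceeds the upper bound, so it is the only one that would trigger an alert under this assumed rule.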