Threshold types
Define conditions for when a metric should be considered a data quality incident.
When you create a Validator, you must configure a Threshold to identify data quality incidents. When your data breaches a Threshold, Validio creates an incident that you can inspect on the Validator details page.
Optionally, you can be notified about identified incidents in different channels, such as Slack or webhooks. For information on notification rules and channels, refer to Notifications.
Validio supports the following Threshold types:
Threshold name | Description | Example use case |
---|---|---|
Dynamic threshold | Calculates automatic thresholds for the metric, based on statistical methods. Includes user-adjustable sensitivity. | Track daily average of sales to detect anomalies. |
Fixed threshold | Performs comparison operations between the metric and a specified numeric threshold, such as <, >, <=, >=, =, != | Check that no values in field ‘Age’ are less than zero. |
Dynamic threshold
Dynamic thresholds create an automatic and dynamic range on numeric metrics. The algorithm is trained on new data and continuously improves as more data is read.
Parameter name | Parameter value |
---|---|
Sensitivity | Positive float value |
Decision bounds type | Upper and lower, Upper, Lower |
Parameter details
Sensitivity
Higher sensitivity on the dynamic threshold means that the accepted range of values is narrower, which identifies more threshold breaches. Conversely, lower sensitivity values imply a wider range of accepted values. Typical starting values for test purposes are between 2 and 3.
Setting the right sensitivity is often an iterative process of balancing false positives (and the resulting Threshold fatigue) against false negatives (missed real errors).
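The relationship between sensitivity and the width of the accepted range can be illustrated with a minimal sketch. This is not Validio's algorithm, which is not described here. The function `accepted_range`, the `base_width` constant, and the mean-and-standard-deviation model are all assumptions chosen only to show that a higher sensitivity produces a narrower range.

```python
from statistics import mean, stdev


def accepted_range(history, sensitivity, base_width=6.0):
    """Return a (lower, upper) range around the historical mean.

    Hypothetical model for illustration only: the half-width is
    inversely proportional to sensitivity, so higher sensitivity
    gives a narrower range and therefore more breaches.
    """
    if sensitivity <= 0:
        raise ValueError("sensitivity must be a positive float")
    center = mean(history)
    half_width = base_width * stdev(history) / sensitivity
    return center - half_width, center + half_width


history = [100.0, 102.0, 98.0, 101.0, 99.0]
low_2, high_2 = accepted_range(history, sensitivity=2.0)
low_3, high_3 = accepted_range(history, sensitivity=3.0)

# The range at sensitivity 3 is narrower than the range at sensitivity 2.
print(high_2 - low_2 > high_3 - low_3)
```

In practice you would start with a value in the suggested range, review the incidents it produces, and adjust from there.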
Decision bounds type
Decision bounds types specify whether the boundaries are double or single-sided:
Upper and lower: Both upper and lower anomalies are detected.
Upper: Only deviations upwards are treated as anomalies. Deviations downwards are never considered anomalous. Used, for example, in freshness validations monitoring time since the last update, where 'too fresh' data should not be flagged.
Lower: Only deviations downwards are treated as anomalies. Deviations upwards are never considered anomalous.
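The three decision bounds types can be sketched as a simple check against an accepted range. The names `DecisionBounds` and `is_anomaly` are hypothetical and chosen for illustration; they are not part of any Validio API, and the range values below are made up.

```python
from enum import Enum


class DecisionBounds(Enum):
    """Hypothetical names mirroring the three decision bounds types."""
    UPPER_AND_LOWER = "upper_and_lower"
    UPPER = "upper"
    LOWER = "lower"


def is_anomaly(value, lower, upper, bounds):
    """Return True if value breaches the range for the given bounds type."""
    above = value > upper
    below = value < lower
    if bounds is DecisionBounds.UPPER:
        return above   # deviations downwards are ignored
    if bounds is DecisionBounds.LOWER:
        return below   # deviations upwards are ignored
    return above or below


# Freshness example: hours since the last update, with an assumed
# accepted range of 2 to 10 hours.
too_fresh = is_anomaly(0.5, 2.0, 10.0, DecisionBounds.UPPER)   # False
too_stale = is_anomaly(12.0, 2.0, 10.0, DecisionBounds.UPPER)  # True
```

With the Upper type, data that arrives sooner than expected is not reported, while stale data still is.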
How does it work?
Dynamic thresholds use a combination of algorithms to automatically detect anomalies in your data. They infer trends, seasonality, and peaks, and can adapt to shifts in the data by learning from historic data.
When applied to a backfilled source, the dynamic thresholds can quickly detect upcoming anomalies without any training period. This means you get incidents and insight immediately, even if you lack the domain knowledge to create appropriate fixed thresholds.
Shifts in seasonality and trends are continuously tracked and auto-updated, which means you can use this threshold to monitor sources that are expected to change over time.
Fixed threshold
Fixed thresholds perform comparison operations between numeric metrics and a specified threshold.
Parameter name | Parameter value |
---|---|
Operator | Equal to, Not equal to, Less than, Less than or equal to, Greater than, Greater than or equal to |
Value | Numeric value |
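As a rough sketch, a fixed threshold amounts to applying one comparison operator to the metric and the configured value. The mapping `OPERATORS` and the function `breaches` are hypothetical names used for illustration. The sketch also assumes that the configured condition describes when an incident is raised, not when the data is valid.

```python
import operator

# Operator names from the table above, mapped to Python comparisons.
OPERATORS = {
    "Equal to": operator.eq,
    "Not equal to": operator.ne,
    "Less than": operator.lt,
    "Less than or equal to": operator.le,
    "Greater than": operator.gt,
    "Greater than or equal to": operator.ge,
}


def breaches(metric, operator_name, value):
    """Return True if `metric <operator> value` holds.

    Assumption: a True result means the threshold is breached.
    """
    compare = OPERATORS[operator_name]
    return compare(metric, value)


# Example use case from the table: no values in 'Age' should be below
# zero, checked here against the minimum of a made-up sample.
ages = [34, 27, -1, 52]
incident = breaches(min(ages), "Less than", 0)  # True, because min is -1
```

Note that "Equal to" and "Not equal to" compare exactly, so floating-point metrics may need rounding before comparison.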