
Threshold types

Define conditions for when a metric should be considered a data quality incident.

When you create a Validator, you must configure a Threshold to identify data quality incidents. When your data breaches a Threshold, Validio creates an incident that you can inspect on the Validator page.

Optionally, you can be notified about identified incidents through different channels, such as Slack. For information on notification rules and channels, refer to Notifications.

Validio supports the following Threshold types:

| Threshold name | Description | Example use case | Test type |
| --- | --- | --- | --- |
| Dynamic threshold | Calculates smart auto-thresholds for the metric, based on statistical methods. Includes user-adjustable sensitivity. | Track the daily average of sales to detect anomalies. | Statistical and ML-based tests that enable quick setup and coverage of multiple tests. The thresholds are adaptive and adjust dynamically to the data. |
| Fixed threshold | Performs a comparison operation (<, >, <=, >=, =, !=) between the metric and a specified numeric threshold. | Check that no value in the ‘Age’ feature is less than zero. | Manual tests to transfer domain knowledge and knowledge of dataflows. |
| Monotonic threshold | Performs a comparison operation (<, >, <=, >=, =, !=) between the latest calculated metric and the second latest calculated metric. | Check that each ‘row count’ metric is greater than the previous one, to verify that the table gains at least one record with each batch. | Manual tests to transfer domain knowledge and knowledge of dataflows. |

Dynamic threshold

Dynamic thresholds create an automatic and dynamic range on numeric metrics. The algorithm is trained on new data and continuously improves as more data is read.

| Parameter name | Parameter value |
| --- | --- |
| Sensitivity | Positive float value |
| Decision bounds type | Upper and lower, Upper, or Lower |

Parameter details

Sensitivity

Higher sensitivity on the dynamic threshold means that the accepted range of values is narrower, which identifies more threshold breaches. Conversely, lower sensitivity values imply a wider range of accepted values. Typical starting values used for test purposes are between 2 and 3.

Note: Setting the right sensitivity is often an iterative process. The goal is to balance false positives and threshold fatigue against false negatives and missed real errors.

Decision bounds type

Decision bounds types specify whether the boundaries are double or single-sided:

  • Upper and lower: Both upper and lower anomalies are detected.
  • Upper: Only deviations upwards are treated as anomalies. Deviations downwards are never considered anomalous. Used, for example, in freshness validations that monitor the time since the last update, where data that is 'too fresh' should not raise an incident.
  • Lower: Only deviations downwards are treated as anomalies. Deviations upwards are never considered anomalous.
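
As a rough illustration of how sensitivity and the decision bounds type interact, the sketch below builds an accepted range around the mean of past metric values. This is not Validio's algorithm, which this page does not specify. The mean and standard deviation baseline, the `BASE_WIDTH` constant, the inverse scaling by sensitivity, and all function names are assumptions, chosen only to show that higher sensitivity narrows the range and that single-sided bounds ignore deviations in one direction.

```python
import math
from statistics import mean, stdev

# Hypothetical scaling constant for this sketch; not a Validio parameter.
BASE_WIDTH = 6.0


def decision_bounds(history, sensitivity, bounds_type="upper_and_lower"):
    """Return an illustrative (lower, upper) accepted range for a metric."""
    if sensitivity <= 0:
        raise ValueError("sensitivity must be a positive float")
    center = mean(history)
    # Assumption: the range width is inversely proportional to sensitivity.
    half_width = BASE_WIDTH * stdev(history) / sensitivity
    lower, upper = center - half_width, center + half_width
    if bounds_type == "upper":
        lower = -math.inf  # downward deviations are never anomalous
    elif bounds_type == "lower":
        upper = math.inf  # upward deviations are never anomalous
    elif bounds_type != "upper_and_lower":
        raise ValueError(f"unknown bounds type: {bounds_type}")
    return lower, upper


def is_anomaly(value, bounds):
    """True when the value falls outside the accepted range."""
    lower, upper = bounds
    return value < lower or value > upper


history = [10, 12, 11, 13, 9, 11, 10, 12]
narrow = decision_bounds(history, sensitivity=3.0)
wide = decision_bounds(history, sensitivity=2.0)
# The range at sensitivity 3.0 is narrower than at sensitivity 2.0.
print(narrow[1] - narrow[0] < wide[1] - wide[0])
# With upper-only bounds, a very low value is not flagged.
print(is_anomaly(0, decision_bounds(history, 2.0, "upper")))
```

In this sketch, a single-sided bound is modeled by pushing the unused side to infinity, so the same range check works for all three bounds types.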

How it works

Dynamic thresholds use a combination of algorithms to automatically detect anomalies in your data. They infer trends, seasonality, and peaks, and adapt to shifts in the data by learning from historic data.

Once applied to a backfilled source, the dynamic thresholds can quickly detect upcoming anomalies without any training period. This means you get incidents and insight immediately, even if you lack the domain knowledge to create appropriate fixed thresholds.

Shifts in seasonality and trends are continuously tracked and auto-updated, which means you can use this threshold to monitor sources that are expected to change over time.

Fixed threshold

Fixed thresholds perform comparison operations between numeric metrics and a specified threshold.

| Parameter name | Parameter value |
| --- | --- |
| Operator | Equal to, Not equal to, Less than, Less than or equal to, Greater than, Greater than or equal to |
| Value | Numeric value |
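
The operator and value pair can be pictured as a plain comparison against the metric. The sketch below is a minimal illustration using Python's standard `operator` module. The `OPERATORS` mapping and the `compare_to_fixed` function are hypothetical names for this example, not part of any Validio API, and the code only evaluates the comparison; how a true result maps to an incident depends on how the Threshold is configured.

```python
import operator

# Hypothetical mapping from the operator names listed above to Python comparisons.
OPERATORS = {
    "Equal to": operator.eq,
    "Not equal to": operator.ne,
    "Less than": operator.lt,
    "Less than or equal to": operator.le,
    "Greater than": operator.gt,
    "Greater than or equal to": operator.ge,
}


def compare_to_fixed(metric, operator_name, value):
    """Evaluate `metric <operator> value` for a numeric metric."""
    return OPERATORS[operator_name](metric, value)


# Example: the minimum of an 'Age' feature in a window is -3.
# Comparing it with "Less than" 0 shows that a negative age is present.
min_age = -3
print(compare_to_fixed(min_age, "Less than", 0))
```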

Monotonic threshold

Monotonic thresholds compare the latest calculated metric with the second latest calculated metric.

| Parameter name | Parameter value |
| --- | --- |
| Operator | Equal to, Not equal to, Less than, Less than or equal to, Greater than, Greater than or equal to |

What is compared?

A monotonic threshold compares the calculated numeric metric from the latest window (t) with the metric from the second latest window (t-1).
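
The comparison between window t and window t-1 can be sketched as follows. This is an illustration only: the `OPERATORS` mapping and the `compare_consecutive` function are hypothetical names, not a Validio API, and the first window has no result because it has no predecessor.

```python
import operator

# Hypothetical mapping from operator names to Python comparisons.
OPERATORS = {
    "Equal to": operator.eq,
    "Not equal to": operator.ne,
    "Less than": operator.lt,
    "Less than or equal to": operator.le,
    "Greater than": operator.gt,
    "Greater than or equal to": operator.ge,
}


def compare_consecutive(metrics, operator_name):
    """For each window t after the first, evaluate `metric[t] <operator> metric[t-1]`."""
    compare = OPERATORS[operator_name]
    return [compare(current, previous)
            for previous, current in zip(metrics, metrics[1:])]


# Row counts from four consecutive windows; the third window added no records.
row_counts = [100, 120, 120, 150]
print(compare_consecutive(row_counts, "Greater than"))  # [True, False, True]
```

The `False` entry marks the window where the row count did not grow, which is the case a "strictly increasing" check is meant to surface.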

When should you use Monotonic thresholds?

Monotonic thresholds are suitable in the following example use cases:

  • Check that the row count is always strictly increasing, ensuring that each batch update adds new records.
  • Check that the invoiced amount for each customer increases or stays the same, to catch customers scaling down their service or product usage.
  • Check that inventory is not equal to its value at the preceding transaction, knowing that each record represents a transaction in which an item is either bought or returned, so inventory should be updated accordingly.
  • Check that the loan principal decreases or stays the same, knowing that no new loans are taken and automatic mortgage payments are in place.