Smart

Smart alerts create an automatic, dynamic range of accepted values for numeric metrics, based on historical data and an exponential smoothing algorithm.

Configuration parameters

Parameter name and description    Parameter values
1. Name                           Arbitrary string
2. Sensitivity                    Positive float value
3. Smoothing                      Positive integer value (>= 1)
4. Peak detection                 Checkbox
5. Decision bounds type           One of:
  • Upper and lower
  • Upper
  • Lower

Parameter details

Sensitivity

Higher sensitivity means that more anomalies will be flagged, i.e. the accepted range of values will be narrower. Conversely, lower sensitivity values imply a wider range of accepted values.

Typical starting values to test range between 2 and 3.
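The inverse relationship between sensitivity and range width can be illustrated with a minimal sketch. The function name `accepted_range`, the `center` and `spread` inputs, and the `spread / sensitivity` formula are all assumptions made for illustration; the exact formula used by Smart alerts is not described here.

```python
def accepted_range(center, spread, sensitivity):
    """Return a (lower, upper) range around `center`.

    ASSUMPTION: the half-width shrinks in inverse proportion to the
    sensitivity. This only illustrates "higher sensitivity means a
    narrower range"; it is not the product's actual formula.
    """
    if sensitivity <= 0:
        raise ValueError("sensitivity must be a positive float")
    half_width = spread / sensitivity
    return center - half_width, center + half_width


# Hypothetical metric centered at 100 with a typical spread of 12.
wide = accepted_range(100.0, 12.0, sensitivity=2.0)
narrow = accepted_range(100.0, 12.0, sensitivity=3.0)
print(wide, narrow)
```

Under this assumption, moving from sensitivity 2 to 3 narrows the range, so values that were previously accepted may now be flagged.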

Smoothing

Higher smoothing values imply that more historical data is taken into account when calculating the accepted range. Heuristically, the smoothing parameter can be thought of as roughly the number of historical metric datapoints taken into account.

Typical starting values to test range between 5 and 10.

👍

Setting the right sensitivity and smoothing values is often an iterative process at first, balancing false positives and alert fatigue against false negatives and missed real errors.

Peak detection

Used when you are primarily interested in monitoring 'peak' and 'valley' metric values (local maxima and local minima). For example, consider freshness monitors:

  • Freshness monitors calculate the time since the latest batch, so the value keeps increasing until a new batch arrives
  • If you expect e.g. daily jobs, you expect the 'peak value' to consistently be around 1 day before dropping to zero when a new job arrives
  • The values leading up to 1 day are not very interesting; we are only interested in monitoring that all peak values are around 1 day. When this is not the case, an alert is sent

For use cases like the one above, peak detection should be checked.
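The freshness example can be sketched as follows. This is an illustrative way to pick out local maxima and minima from a sawtooth-shaped series, not the product's implementation; the function name `local_extrema` and the sample data are hypothetical.

```python
def local_extrema(values):
    """Return (peaks, valleys): values strictly above or strictly below
    both of their neighbors. Endpoints are skipped because they have
    only one neighbor."""
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            peaks.append(cur)
        elif cur < prev and cur < nxt:
            valleys.append(cur)
    return peaks, valleys


# Hypothetical freshness metric (hours since the latest batch), sampled
# every 8 hours. The last cycle arrives late, so its peak is 30, not 24.
freshness_hours = [0, 8, 16, 24, 0, 8, 16, 24, 0, 8, 16, 30, 0]
peaks, valleys = local_extrema(freshness_hours)
print(peaks)    # only these values would be checked against the range
print(valleys)
```

Only the extracted peaks (and valleys) are compared against the accepted range; the intermediate values that lead up to each peak are ignored.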

Decision bounds type

The decision bounds type specifies whether the boundaries should be two-sided or one-sided.

  • Upper and lower: Both upward and downward anomalies will be detected
  • Upper: Only upward deviations will be treated as anomalies; downward deviations will never be considered anomalous (used e.g. for freshness validations that monitor the time since the last update, where 'too fresh' data should not trigger an alert)
  • Lower: Only downward deviations will be treated as anomalies; upward deviations will never be considered anomalous
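The three options can be summarized in a small sketch. The function name `is_anomalous` and the string identifiers `"upper_and_lower"`, `"upper"` and `"lower"` are hypothetical labels for the choices above, not actual configuration values.

```python
def is_anomalous(value, lower, upper, bounds_type):
    """Decide whether `value` is anomalous for an accepted range
    [lower, upper] under the chosen decision bounds type.

    ASSUMPTION: the bounds_type strings are illustrative names only.
    """
    above = value > upper
    below = value < lower
    if bounds_type == "upper_and_lower":
        return above or below
    if bounds_type == "upper":
        return above
    if bounds_type == "lower":
        return below
    raise ValueError(f"unknown bounds type: {bounds_type!r}")


# Freshness example: accepted range of 20 to 28 hours since last update.
print(is_anomalous(2, 20, 28, "upper"))            # 'too fresh' is ignored
print(is_anomalous(2, 20, 28, "upper_and_lower"))  # flagged as too low
print(is_anomalous(40, 20, 28, "upper"))           # flagged as too stale
```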