Smart
Filter individual outlier datapoints based on empirical data
Configuration parameters
Parameter name and description  Parameter values 
1. Name  Arbitrary String 
2. Target feature  List of source features with numeric data types 
3. Sensitivity  Positive float number 
4. Smoothing  Positive integer >= 1 
5. Minimum absolute difference  Float >= 0 
6. Minimum relative difference  Float >= 0 (%) 
7. Computed metric 

Parameter details
For each sessionized datapoint batch, a smart filter will compare all of the new datapoints to a modeled empirical distribution based on the most recent batches. How much empirical data is taken into account in the modeled distribution is governed by the smoothing parameter, while the sensitivity parameter controls the bounds for what should be considered an anomaly.
Finding the right parameter values is often an iterative process at first: you balance false positives (and the alert fatigue they cause) against false negatives (and the real errors they miss).
Sensitivity
A higher value causes more datapoints to be labeled as anomalies. For example, a sensitivity value of 3 will label all datapoints beyond 3𝛔 from the mean as anomalies, where 𝛔 is the estimated standard deviation. The sensitivity value is inversely proportional to the bounds outside which datapoints are considered anomalies:
Anomaly bound = (3/X)*3𝛔 where X is the sensitivity value
E.g. a sensitivity value of 5 gives (3/5)*3𝛔 = 1.8𝛔, i.e. all datapoints beyond 1.8𝛔 from the mean are considered anomalies.
Typical starting values to test are 2-3.
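The bound formula above can be sketched in a few lines of Python. The function names `anomaly_bound_in_sigmas` and `is_outside_bound` are hypothetical and exist only for illustration; the sketch assumes that the mean and 𝛔 of the reference distribution are already known, and it is not the product's actual implementation.

```python
def anomaly_bound_in_sigmas(sensitivity: float) -> float:
    """Return the anomaly bound as a multiple of sigma, i.e. (3 / X) * 3."""
    if sensitivity <= 0:
        raise ValueError("sensitivity must be a positive number")
    return (3.0 / sensitivity) * 3.0


def is_outside_bound(value: float, mean: float, sigma: float, sensitivity: float) -> bool:
    """True if the value lies further from the mean than the anomaly bound."""
    return abs(value - mean) > anomaly_bound_in_sigmas(sensitivity) * sigma


# Sensitivity 3 keeps the default 3-sigma bound; sensitivity 5 tightens it to about 1.8 sigma.
print(anomaly_bound_in_sigmas(3))
print(round(anomaly_bound_in_sigmas(5), 2))
```

Because the bound shrinks as the sensitivity grows, raising the sensitivity flags more datapoints.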
Smoothing
The smoothing parameter governs how much of the historical data will be taken into account when modeling the empirical distribution. Heuristically, the value of the smoothing parameter can roughly be thought of as the number of historic sessionized batches taken into account.
Choose a lower smoothing value if you know that your data is prone to distribution shifts, and a higher value if you expect your distribution to be fairly stable.
Typical starting values to test are 5-10.
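To build intuition for the heuristic above, the following sketch treats the smoothing value literally as a window of the most recent batches and pools them into one reference mean and standard deviation. This is an assumption made purely for illustration: the actual model behind the smart filter is not described here and may weight history differently. The class name `RollingReference` is hypothetical.

```python
from collections import deque
from statistics import fmean, pstdev


class RollingReference:
    """Keeps the last `smoothing` batches and summarizes them (illustrative only)."""

    def __init__(self, smoothing: int) -> None:
        if smoothing < 1:
            raise ValueError("smoothing must be an integer >= 1")
        # A deque with maxlen drops the oldest batch automatically.
        self._batches = deque(maxlen=smoothing)

    def update(self, batch) -> None:
        """Add a new batch of numeric datapoints to the history."""
        self._batches.append(list(batch))

    def stats(self):
        """Return (mean, standard deviation) of all datapoints in the window."""
        pooled = [x for batch in self._batches for x in batch]
        if not pooled:
            raise ValueError("no datapoints available yet")
        return fmean(pooled), pstdev(pooled)


reference = RollingReference(smoothing=2)
for batch in ([1.0, 1.0], [2.0, 2.0], [4.0, 4.0]):
    reference.update(batch)
# Only the last two batches remain, so the first batch no longer influences the result.
print(reference.stats())
```

In this simplified picture, a small window lets the reference adapt quickly after a distribution shift, while a large window gives a more stable reference.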
Minimum absolute difference
The minimum absolute difference between the feature value and the mean of the reference distribution for the point to be considered an outlier/anomaly.
E.g. if set to 10, the point being validated must differ from the mean of the reference distribution by more than 10 in either direction (more than 10 above or more than 10 below the mean), and must also fall outside the bounds the smart filter sets, to be considered an anomaly. This is essentially an "ignore any alerts within this difference" parameter.
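A minimal sketch of how the two conditions combine, assuming the outcome of the sensitivity bound check is already available as a boolean. The function name `is_anomaly_with_min_abs_diff` and its parameters are hypothetical.

```python
def is_anomaly_with_min_abs_diff(
    value: float,
    reference_mean: float,
    outside_bound: bool,
    min_abs_diff: float,
) -> bool:
    """A point is an anomaly only if it is outside the bound AND far enough from the mean."""
    if min_abs_diff < 0:
        raise ValueError("min_abs_diff must be >= 0")
    far_enough = abs(value - reference_mean) > min_abs_diff
    return outside_bound and far_enough


# With a reference mean of 100 and a minimum absolute difference of 10:
print(is_anomaly_with_min_abs_diff(105.0, 100.0, outside_bound=True, min_abs_diff=10.0))  # difference of 5 is ignored
print(is_anomaly_with_min_abs_diff(85.0, 100.0, outside_bound=True, min_abs_diff=10.0))   # difference of 15 counts
```

The check is symmetric, so points below the mean are treated the same way as points above it.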
Minimum relative difference
The minimum difference for a point to be considered an anomaly, expressed in relative terms: the absolute difference divided by the absolute value of the mean of the reference data.
E.g. if the mean of the reference distribution is 10 and the user sets the parameter value to 10%, datapoints falling between 9 and 11 will not be considered anomalies.
Use this option instead of 'Minimum absolute difference' when you care more about the relative difference to the reference mean than the absolute difference.
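The relative check can be sketched as follows. The function names are hypothetical, and raising an error for a reference mean of zero is an assumption of this sketch, since the behavior in that case is not described here.

```python
def relative_difference_pct(value: float, reference_mean: float) -> float:
    """Absolute difference divided by the absolute reference mean, in percent."""
    if reference_mean == 0:
        # Assumption: the relative difference is undefined for a zero mean.
        raise ValueError("relative difference is undefined for a reference mean of 0")
    return abs(value - reference_mean) / abs(reference_mean) * 100.0


def exceeds_min_relative_difference(
    value: float, reference_mean: float, min_rel_diff_pct: float
) -> bool:
    """True if the point differs from the mean by more than the configured percentage."""
    return relative_difference_pct(value, reference_mean) > min_rel_diff_pct


# Reference mean 10, minimum relative difference 10%:
print(exceeds_min_relative_difference(10.5, 10.0, 10.0))  # 5% difference, ignored
print(exceeds_min_relative_difference(12.0, 10.0, 10.0))  # 20% difference, can be an anomaly
```

As with the absolute variant, a point that exceeds this threshold still has to fall outside the sensitivity bounds to be labeled an anomaly.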