In Validio, you can configure Segmentation to validate metrics on segments of your data. For example, if a Segmentation is specified for the field
Marital status, metrics are validated for each distinct value within that field.
Segmentation works similarly to
GROUP BY in SQL.
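The GROUP BY analogy can be sketched in plain Python. This is only an illustration of the idea, not Validio's implementation or API: the sample records, the snake_case field names, and the helper `metric_per_segment` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; the field names mirror the example above.
records = [
    {"marital_status": "Married", "annual_salary": 60000},
    {"marital_status": "Single", "annual_salary": 45000},
    {"marital_status": "Married", "annual_salary": 80000},
    {"marital_status": "Single", "annual_salary": 55000},
]


def metric_per_segment(rows, segment_field, value_field):
    """Return the mean of value_field for each distinct value of segment_field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[segment_field]].append(row[value_field])
    return {segment: mean(values) for segment, values in groups.items()}


# Roughly equivalent to:
#   SELECT marital_status, AVG(annual_salary) FROM records GROUP BY marital_status
avg_salary = metric_per_segment(records, "marital_status", "annual_salary")
print(avg_salary)
```

Each key in the result corresponds to one segment, and the metric is computed separately for each of them.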
You can create Segmentations on multiple fields. For example, if a Segmentation is specified for the fields
Marital status, the metric average
Annual salary is validated for each combination of distinct values within those fields.
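A minimal sketch of segmenting on more than one field, assuming a second field named `employment_type`. That name is invented for illustration, as are the sample records and the helper function; none of this is Validio's API. Each segment is identified by a tuple holding one value per segmentation field.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; "employment_type" is an invented second field.
records = [
    {"employment_type": "Full-time", "marital_status": "Married", "annual_salary": 80000},
    {"employment_type": "Full-time", "marital_status": "Married", "annual_salary": 70000},
    {"employment_type": "Full-time", "marital_status": "Single", "annual_salary": 60000},
    {"employment_type": "Part-time", "marital_status": "Married", "annual_salary": 30000},
    {"employment_type": "Part-time", "marital_status": "Single", "annual_salary": 25000},
    {"employment_type": "Part-time", "marital_status": "Single", "annual_salary": 35000},
]


def metric_per_segment(rows, segment_fields, value_field):
    """Return the mean of value_field for each observed combination of segment_fields."""
    groups = defaultdict(list)
    for row in rows:
        # The segment key is the combination of values across all segmentation fields.
        key = tuple(row[field] for field in segment_fields)
        groups[key].append(row[value_field])
    return {key: mean(values) for key, values in groups.items()}


segments = metric_per_segment(
    records, ["employment_type", "marital_status"], "annual_salary"
)
for key, avg in sorted(segments.items()):
    print(key, avg)
```

With two distinct values in each field, this sample produces four segments. In general, the number of segments can grow as large as the product of the distinct-value counts of the segmentation fields.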
Validio has customers that use thousands of segments, and Validio is continuously increasing the number of supported segments. However, a very large number of segments might affect performance.
Information loss occurs when aggregating data. Conversely, by segmenting the data, a more granular analysis can be performed.
A retail organization wants to validate their price data, to make sure their products are properly priced. For validation purposes, they want to use the fields
price and currency. Because of differences in currency, the prices have different orders of magnitude, which means that only validating datapoints from the
price column makes little sense. The data must be segmented based on currency before performing a data quality validation to make sure there are no anomalies.
Think of the difference in the order of magnitude if the same price for a specific item is expressed in USD versus Iranian Rial, where the conversion rate is approximately 1 USD = 40 000 Iranian Rial.
If no Segmentation on currency is applied when validating price data,
the retail organization would be comparing apples with cars.
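The effect can be illustrated with a small sketch. The prices below are made up, and the Rial values simply apply the rough 1 USD = 40 000 rate mentioned above (IRR is the ISO code for Iranian Rial). This is not how Validio computes metrics; it only shows why a metric over the unsegmented price column is hard to interpret.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical prices for the same kind of item, expressed in two currencies.
prices = [
    {"currency": "USD", "price": 10.0},
    {"currency": "USD", "price": 12.0},
    {"currency": "USD", "price": 11.0},
    {"currency": "IRR", "price": 400000.0},
    {"currency": "IRR", "price": 480000.0},
    {"currency": "IRR", "price": 440000.0},
]

# Without segmentation: one mean across all rows, dominated by the IRR values.
overall_mean = mean(row["price"] for row in prices)

# With segmentation on currency: one mean per currency.
by_currency = defaultdict(list)
for row in prices:
    by_currency[row["currency"]].append(row["price"])
segment_means = {currency: mean(values) for currency, values in by_currency.items()}

print("overall:", overall_mean)
print("per currency:", segment_means)
```

The overall mean sits far from every USD price and well below every IRR price, so it describes neither group. The per-currency means stay within the range of their own segment, which makes deviations from them meaningful.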
Updated 3 months ago