Reference source config

Compare data with a reference.

📘

You can only configure a reference source for the following Validator types:

You can calculate a metric relative to a ground-truth source that you specify. With a reference source, you can, for example, compare data between two completely different sources. You can also use a reference source to create a sliding window.

Configuration parameters

The following parameters are included in the reference source config:

Sources

Select a Source as your reference data source. You can use the same Source as in the Source config step, or another Source.

📘

Segmentation considerations

The reference source uses the same Segmentation as specified in the Source config step. This assumes that the fields defined in the Source Segmentation are also available in the reference source schema.

If you use a different Source as the reference source, verify that its schema has the same fields with matching data types.
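The schema check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the product: the schemas are plain dictionaries that map field names to data types, and the function name `segmentation_mismatches` and the sample fields are hypothetical.

```python
# Hypothetical schemas: field name -> data type. These are illustrative
# dictionaries, not an actual platform API.
def segmentation_mismatches(source_schema, reference_schema, segmentation_fields):
    """List segmentation fields that are missing or typed differently."""
    problems = []
    for field in segmentation_fields:
        if field not in source_schema:
            problems.append(f"{field}: missing in source")
        elif field not in reference_schema:
            problems.append(f"{field}: missing in reference source")
        elif source_schema[field] != reference_schema[field]:
            problems.append(
                f"{field}: type {source_schema[field]} in source, "
                f"{reference_schema[field]} in reference source"
            )
    return problems


source_schema = {"country": "string", "device": "string", "amount": "float"}
reference_schema = {"country": "string", "device": "integer", "amount": "float"}

# Segmentation on country and device: device has a different type in the
# reference source, so it is reported.
mismatches = segmentation_mismatches(
    source_schema, reference_schema, ["country", "device"]
)
print(mismatches)
```

An empty list means that every segmentation field exists in both schemas with the same data type.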

Validator configuration wizard - Reference source config step.


Field

Select the relevant field to act as a reference for your metric calculation.

📘

The field data type in the reference source config must match the field data type used in the source config.

Window

Select a configured Window from the reference source to use in the metric calculation.

For more information, refer to Windows.

Window offset

Choose how many Windows the reference source is shifted into the past.

An offset value of 0 means that you are comparing the current data window of your source against the current data window of your reference source.

You can choose a larger value to shift the reference source window into the past.

For example, an offset of 7 for a daily window compares the current window of your source against the window of your reference source from 1 week ago. This assumes that both the target and the reference source have a daily window.
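The offset arithmetic can be sketched as follows. This is an illustration only: it assumes fixed-size, non-overlapping windows, and the function name `reference_window_start` and the dates are hypothetical.

```python
from datetime import datetime, timedelta


def reference_window_start(current_window_start, window_size, offset):
    """Start of the reference window, shifted `offset` windows into the past.

    Assumes fixed-size, non-overlapping windows (an illustrative assumption).
    """
    if offset < 0:
        raise ValueError("offset must be 0 or greater")
    return current_window_start - offset * window_size


daily = timedelta(days=1)
current = datetime(2024, 3, 15)

# Offset 0: the reference window covers the same day as the source window.
same_window = reference_window_start(current, daily, 0)

# Offset 7 with a daily window: the reference window lies one week back.
week_ago = reference_window_start(current, daily, 7)

print(same_window.date(), week_ago.date())
```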

📘

Empty reference Window?

The reference window remains empty until it can compare to datapoints from your source.

For example, if you set the window offset farther back in time than the lookback time specified for a Data Warehouse source type, the reference window might remain empty until it reaches the lookback time.

Number of Windows

Choose over how many Windows the metric is calculated for the reference source.

By default, the number of windows is set to 1. Larger values effectively smooth the metric calculated for the reference. Choose a number larger than 1 to compare your current data against an aggregate of the values calculated over the total number of windows.


Example that compares the current data window on a source with the metric calculated on a reference source one week ago.
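To see why a larger number of windows smooths the reference value, consider the sketch below. It assumes that the aggregate is a plain mean of the per-window metric values. The function name `reference_value` and the sample numbers are hypothetical, and the aggregation that is actually applied may differ.

```python
from statistics import mean


def reference_value(window_metrics, number_of_windows=1):
    """Aggregate the metric over the most recent reference windows.

    `window_metrics` is ordered from oldest to newest. The mean is an
    assumed aggregation, used here for illustration only.
    """
    if number_of_windows < 1:
        raise ValueError("number_of_windows must be 1 or greater")
    return mean(window_metrics[-number_of_windows:])


# One metric value per daily reference window; the newest one is an outlier.
daily_metric = [100.0, 104.0, 96.0, 98.0, 102.0, 100.0, 135.0]

single = reference_value(daily_metric)        # uses only the newest window
smoothed = reference_value(daily_metric, 7)   # averages all seven windows

print(single, smoothed)
```

With one window, the outlier becomes the reference value. With seven windows, it is averaged with six ordinary days, so the reference value moves much less.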

Filter

Optionally, select a filter to set a rule that determines which raw data from your reference source is included in the metric calculation.

For more information, refer to filters.

Examples

These examples illustrate how the metric is calculated for reference sources with different daily window configurations.

Example 1

Example that compares the current data window on a source with the current data window on a reference source.  
The reference source is either a different column in the same source or a different source.


Example 2

Example that compares the current data window on a source with the calculated metric from yesterday on a reference source.

Example 3

Example that compares the current data window on a source with the average of the values calculated over the past week on a reference source.
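The three examples can be expressed as combinations of window offset and number of windows. The sketch below lists the daily reference windows that each combination would cover. It rests on assumptions: the function name `reference_window_dates` is hypothetical, the newest reference window is taken to lie `offset` days back with the remaining windows extending further into the past, and the settings shown for Example 3 (offset 1, 7 windows) are one plausible reading of "the past week".

```python
from datetime import date, timedelta


def reference_window_dates(current_day, offset, number_of_windows):
    """Dates of the daily reference windows, newest first.

    Assumption: the newest reference window lies `offset` days back and
    the remaining windows extend further into the past.
    """
    return [
        current_day - timedelta(days=offset + i)
        for i in range(number_of_windows)
    ]


today = date(2024, 3, 15)

# Example 1: current window against the current reference window.
example_1 = reference_window_dates(today, offset=0, number_of_windows=1)

# Example 2: current window against yesterday's reference window.
example_2 = reference_window_dates(today, offset=1, number_of_windows=1)

# Example 3 (assumed settings): current window against the past seven days.
example_3 = reference_window_dates(today, offset=1, number_of_windows=7)

for name, days in [("1", example_1), ("2", example_2), ("3", example_3)]:
    print(f"Example {name}: {days[-1]} to {days[0]}")
```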