About Windows
A Window (batch) is required to calculate and validate metrics on a dataset in your Source.
You must configure at least one Window on your Source before you can create Validators. You can create several Windows on the same Source, depending on your business requirements.
For example, to validate an aggregate such as the mean, max, or min, you must define a Window that determines which datapoints are included in the calculation.
Depending on your Source type, you can configure the following Window types: Fixed Batch, Global, Tumbling, or File.
Fixed batch window
Fixed batch windows are defined by a specified number of datapoints, from a certain data-time field.
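The grouping described above can be illustrated with a minimal Python sketch. It is not Validio's implementation: the function name `fixed_batches`, the row layout, and the choice to leave an incomplete trailing batch open are assumptions made for illustration only.

```python
from typing import Iterator


def fixed_batches(rows: list[dict], time_field: str, batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive batches of exactly `batch_size` rows, ordered by `time_field`.

    Assumption: a trailing batch with fewer than `batch_size` rows is not
    yielded, because it is still waiting for more datapoints.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(rows, key=lambda row: row[time_field])
    # Step through the ordered rows one full batch at a time.
    for start in range(0, len(ordered) - batch_size + 1, batch_size):
        yield ordered[start:start + batch_size]


# Hypothetical datapoints with an integer stand-in for the time field.
rows = [{"ts": t, "value": t * 10} for t in (5, 1, 4, 2, 3)]
batches = list(fixed_batches(rows, time_field="ts", batch_size=2))
```

With five datapoints and a batch size of two, the sketch produces two complete batches and leaves the fifth datapoint for a later batch.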
Global window
Using a global window is like performing a full load on every polling cycle: for each poll, all validator metrics are computed over the entire Source rather than over a sequence of windows.
Validators using a global window are evaluated based on clock-time, rather than data-time.
The global window is only available for Data Warehouses and Query Engines.
Tumbling window
Tumbling windows are a series of fixed-size, non-overlapping, contiguous time intervals, for example, hourly or weekly.
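To make the idea of fixed-size, non-overlapping, contiguous intervals concrete, the following Python sketch maps a timestamp to the tumbling window that contains it. The function name `tumbling_window` and the alignment of windows to the Unix epoch are assumptions for illustration; the source does not state how Validio aligns window boundaries.

```python
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumbling_window(ts: datetime, size: timedelta) -> tuple[datetime, datetime]:
    """Return the half-open window [start, end) that contains `ts`."""
    index = (ts - EPOCH) // size  # whole number of windows elapsed since the epoch
    start = EPOCH + index * size
    return start, start + size


hour = timedelta(hours=1)
start, end = tumbling_window(datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc), hour)
```

Because every timestamp falls into exactly one `[start, end)` interval, the windows never overlap and leave no gaps between them.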
Note
The smallest allowed size for a tumbling window is:
- Data Warehouses, Object Storages and Query Engines: 30 minutes
- Data Streams: 10 minutes
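The minimum sizes listed above can be expressed as a simple lookup. This is a hypothetical sketch: the dictionary keys and the helper `is_allowed_size` are illustrative names, not part of any Validio API.

```python
from datetime import timedelta

# Minimum tumbling window sizes taken from the list above.
# The keys are hypothetical labels for the Source categories.
MIN_TUMBLING_SIZE = {
    "data_warehouse": timedelta(minutes=30),
    "object_storage": timedelta(minutes=30),
    "query_engine": timedelta(minutes=30),
    "data_stream": timedelta(minutes=10),
}


def is_allowed_size(source_kind: str, size: timedelta) -> bool:
    """Return True if `size` meets the minimum for the given Source category."""
    return size >= MIN_TUMBLING_SIZE[source_kind]
```

For example, a 15-minute window passes the check for a data stream but not for a data warehouse.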
Validators with a Tumbling window always follow data time, which means that the time shown in graphs and metrics is the time of the dataset (for example, the timestamp of the data in the warehouse).
Validio polls for new data on a schedule defined by the polling interval configured on the source. A tumbling window closes (finishes the computation and updates the graph) as soon as all the data for that window has arrived. Validio determines that all the data has arrived in two ways:
- New data arrives with a timestamp that is after the window end time. For example, if the window is from 13:00 to 14:00 and there is data at 14:30, Validio assumes all data between 13:00 - 14:00 has arrived.
- When Disable window timeout is selected and data exists within a window. For example, if the window is from 13:00 to 14:00 and a poll for new data finds a datapoint with a timestamp of 13:30, that window closes.
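The two closing conditions above can be sketched as a single predicate. This is an illustrative reading of the rules, not Validio's code: the function `window_is_closed`, its parameters, and the treatment of the window as a half-open interval are assumptions.

```python
from datetime import datetime
from typing import Iterable


def window_is_closed(
    window_start: datetime,
    window_end: datetime,
    seen_timestamps: Iterable[datetime],
    disable_window_timeout: bool,
) -> bool:
    """Apply the two closing rules to the timestamps found by polling so far."""
    timestamps = list(seen_timestamps)
    # Rule 1: data with a timestamp after the window end has arrived.
    if any(ts > window_end for ts in timestamps):
        return True
    # Rule 2: Disable window timeout is selected and the window contains data.
    # Assumption: the window is treated as the half-open interval [start, end).
    if disable_window_timeout and any(window_start <= ts < window_end for ts in timestamps):
        return True
    return False


start = datetime(2024, 1, 1, 13, 0)
end = datetime(2024, 1, 1, 14, 0)
late_point = datetime(2024, 1, 1, 14, 30)
inside_point = datetime(2024, 1, 1, 13, 30)
```

With the 13:00 to 14:00 window from the examples, a datapoint at 14:30 closes the window regardless of the setting, while a datapoint at 13:30 closes it only when Disable window timeout is selected.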
In general, validator graphs display data points up to the last datapoint and then stop at the time the last window closed, because there is nothing new to show until new data arrives.
Note
Freshness validators are the exception: their graphs continue to fill in intermediate data points as time passes, even when no data arrives. For more information, see Freshness Validator.
File window
File windows consist of logical batches by file/BLOB. For example, one CSV file is a logical batch.
Note
The File window type is only available for Object Storage Sources.