About Windows
A Window (batch) is required to calculate and validate metrics on a dataset in your Source.
You must configure at least one Window on your Source before you can create Validators. You can create several Windows on the same Source, depending on your business requirements.
For example, to validate an aggregate such as the mean, max, or min, you must define a Window that determines which datapoints the calculation includes.
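As a rough illustration of why an aggregate needs a Window, the sketch below computes the mean, max, and min over the datapoints of a single window. The function name `window_aggregates` and the sample values are hypothetical and are not part of the product's API.

```python
from statistics import mean


def window_aggregates(values):
    """Return the mean, max, and min of the datapoints in one window."""
    if not values:
        raise ValueError("a window needs at least one datapoint")
    return {"mean": mean(values), "max": max(values), "min": min(values)}


# Hypothetical datapoints that fall inside a single window.
window = [12.0, 15.0, 9.0]
print(window_aggregates(window))
```

Changing which datapoints belong to the window changes every one of these values, which is why the Window must be defined before a Validator can be created.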
Window types
Depending on your Source type, you can configure the following Window types:
Fixed batch window
Fixed batch windows are defined by a specified number of datapoints, counted along a specified data-time field.
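One way to picture a fixed batch window is to order the datapoints by the data-time field and cut them into groups of a fixed count. This is a minimal sketch only: `fixed_batches`, the field name `event_time`, and the handling of a trailing partial batch are assumptions, not documented behavior.

```python
from datetime import datetime


def fixed_batches(rows, time_field, batch_size):
    """Sort rows by time_field and split them into batches of batch_size datapoints."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(rows, key=lambda row: row[time_field])
    # Assumption: a trailing batch with fewer datapoints is kept as-is.
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Hypothetical rows with a data-time field called "event_time".
rows = [
    {"event_time": datetime(2024, 1, 1, 10, 5), "value": 3},
    {"event_time": datetime(2024, 1, 1, 10, 1), "value": 1},
    {"event_time": datetime(2024, 1, 1, 10, 9), "value": 5},
    {"event_time": datetime(2024, 1, 1, 10, 3), "value": 2},
    {"event_time": datetime(2024, 1, 1, 10, 7), "value": 4},
]
for batch in fixed_batches(rows, "event_time", batch_size=2):
    print([row["value"] for row in batch])
```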
Global window
Using a global window is similar to performing a full load on every polling cycle: for each poll, all validator metrics are computed for the entire Source rather than for a sequence of windows.
Validators using a global window are evaluated based on clock-time, rather than data-time.
The global window type is only available for Data Warehouses and Query Engines.
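The global window behavior can be sketched as a poll that recomputes a metric over every row and stamps the result with the wall-clock time of the poll, not with a timestamp from the data. The names `poll_global_window` and `evaluated_at` are hypothetical and used only for illustration.

```python
from datetime import datetime, timezone


def poll_global_window(all_rows, metric, now=None):
    """Compute metric over every row in the Source and stamp it with clock-time."""
    evaluated_at = now if now is not None else datetime.now(timezone.utc)
    return {"evaluated_at": evaluated_at, "value": metric(all_rows)}


# Each poll sees the whole (hypothetical) Source, not a slice of it.
source_rows = [4, 8, 15, 16, 23, 42]
result = poll_global_window(source_rows, metric=max)
print(result["value"], result["evaluated_at"].isoformat())
```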
Tumbling window
Tumbling windows are a series of fixed-sized, non-overlapping, and contiguous time intervals. For example, hourly or weekly.
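To make "fixed-sized, non-overlapping, and contiguous" concrete, the sketch below maps a data-time timestamp to the start of the tumbling window that contains it. Aligning windows to the Unix epoch is an assumption made for this example, and `tumbling_window_start` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumbling_window_start(ts, size):
    """Return the start of the tumbling window of length size that contains ts."""
    elapsed = ts - EPOCH
    # Floor division counts the whole windows that have passed since the epoch.
    return EPOCH + (elapsed // size) * size


hourly = timedelta(hours=1)
ts = datetime(2024, 3, 5, 14, 37, tzinfo=timezone.utc)
print(tumbling_window_start(ts, hourly))
```

Every timestamp maps to exactly one window start, so no datapoint is counted twice and no interval is skipped.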
The smallest allowed size for a tumbling window is:
- Data Warehouses, Object Storages and Query Engines: 30 minutes
- Data Streams: 10 minutes
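The minimum sizes listed above could be checked before a window is configured, as in the sketch below. The source-kind keys and the `check_tumbling_size` function are hypothetical; only the 30-minute and 10-minute limits come from this page.

```python
from datetime import timedelta

# Minimum tumbling window sizes from the list above, keyed by hypothetical names.
MIN_TUMBLING_SIZE = {
    "data_warehouse": timedelta(minutes=30),
    "object_storage": timedelta(minutes=30),
    "query_engine": timedelta(minutes=30),
    "data_stream": timedelta(minutes=10),
}


def check_tumbling_size(source_kind, size):
    """Return size if it meets the minimum for source_kind, else raise ValueError."""
    minimum = MIN_TUMBLING_SIZE[source_kind]
    if size < minimum:
        raise ValueError(f"{source_kind} requires a window of at least {minimum}")
    return size


check_tumbling_size("data_stream", timedelta(minutes=10))   # accepted
check_tumbling_size("data_warehouse", timedelta(hours=1))   # accepted
```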
File window
File windows group data into logical batches, one per file or BLOB. For example, one CSV file is one logical batch.
The file window type is only available for Object Storage Source types.
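As an illustration of batching by file, the sketch below groups rows by the file they came from, so that each file forms one window. The `file_windows` function and the file names are hypothetical.

```python
from collections import defaultdict


def file_windows(records):
    """Group (file_name, row) pairs so that each file forms one logical batch."""
    batches = defaultdict(list)
    for file_name, row in records:
        batches[file_name].append(row)
    return dict(batches)


# Hypothetical rows read from two CSV files in an object storage bucket.
records = [
    ("orders_2024_01.csv", {"amount": 10}),
    ("orders_2024_02.csv", {"amount": 7}),
    ("orders_2024_01.csv", {"amount": 12}),
]
for name, rows in file_windows(records).items():
    print(name, len(rows))
```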