
Configuring Windows

Data is validated in batches, called windows, which you can configure on your source. You can then select the configured window when you create new validators or segmentations on the source.

To add a window to a source:

  1. Navigate to the source's details page and click the Windows tab.
  2. Click + New window.
  3. Under Select Type, choose the Window type that you want to create: Fixed Batch, Global, Tumbling, or File. You can select only the window types that are valid for this source. For example, file windows are only supported for object storage sources.
  4. Under Configuration, specify the required configuration options for your Window type. For more information, refer to the configuration parameters for each Window type.

📘

Note

Window configuration relies on a time field. Validio converts the values of the time field to UTC, but the graphs always display times in your system's timezone.

  5. (Optional) Under Segment retention period (days), enter the maximum number of days to keep segments when new data has not been seen.
  6. Enter a Name for the window, or click Generate name to automatically create one based on your configuration.
  7. Click Create Window.

After creating a window, you can add validators to use the new window. For more information, see Configuring a Validator.

Configuration Parameters

Fixed Batch Window

| Field | Value | Description |
| --- | --- | --- |
| Data-time field | Field name | Identifier for the index field used to configure the Window. |
| Batch size | Numeric | Number of datapoints (rows) of the Window. For example, 256. |
| Segmented batching | True, False | If True, each segment gets a separate Window of batch size length. |
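
To illustrate how Batch size and Segmented batching interact, the following is a minimal sketch in plain Python. It is not Validio's implementation: the function name `fixed_batches`, the `segment_key` callback, and the choice to leave incomplete batches open are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, Iterator, Optional


def fixed_batches(
    rows: Iterable[dict],
    batch_size: int,
    segment_key: Optional[Callable[[dict], Hashable]] = None,
) -> Iterator[tuple]:
    """Yield (segment, batch) pairs, each batch holding exactly batch_size rows.

    Hypothetical helper. With segment_key=None, all rows share one buffer
    (segmented batching off). With a segment_key, every segment fills its
    own buffer (segmented batching on). Incomplete batches are not emitted.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    buffers = defaultdict(list)
    for row in rows:
        segment = segment_key(row) if segment_key else None
        buffer = buffers[segment]
        buffer.append(row)
        if len(buffer) == batch_size:
            yield segment, list(buffer)  # emit a copy, then start a new window
            buffer.clear()


rows = [
    {"seg": "a", "v": 1},
    {"seg": "b", "v": 2},
    {"seg": "a", "v": 3},
    {"seg": "b", "v": 4},
    {"seg": "a", "v": 5},
]
unsegmented = list(fixed_batches(rows, batch_size=2))
segmented = list(fixed_batches(rows, batch_size=2, segment_key=lambda r: r["seg"]))
```

With segmented batching off, rows are grouped in arrival order regardless of segment. With it on, each segment closes a window only once it has collected its own two rows, so segments that receive little data produce windows less often.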

Global Window

Global window requires no configuration.

Tumbling Window

| Field | Value | Description |
| --- | --- | --- |
| Data-time field | Field name | The name of the field that references the timestamp associated with each record or row in the data source. |
| Window size | Numeric | Length of the Window in the selected time unit. For example, 1. |
| Unit | Minute, Hour, Day, Week, Month | Unit of time to define Window size. |
| Disable window timeout | True, False | Set to true if the window should be automatically closed without considering the most recent data-time. |
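
As a rough illustration of how Window size and Unit define a tumbling window, the sketch below maps a timestamp to the window that contains it. It is an assumption-laden example, not Validio's implementation: the function name `tumbling_window_bounds` is hypothetical, windows are assumed to align to the Unix epoch in UTC, and only fixed-length units are handled (Month is omitted because its length varies).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FIXED_UNITS = {"minutes", "hours", "days", "weeks"}


def tumbling_window_bounds(ts: datetime, size: int, unit: str):
    """Return the [start, end) bounds in UTC of the window containing ts.

    Hypothetical helper. Assumes windows are contiguous, non-overlapping,
    and aligned to the Unix epoch.
    """
    if unit not in FIXED_UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    if size < 1:
        raise ValueError("size must be at least 1")
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    step = timedelta(**{unit: size})
    ts_utc = ts.astimezone(timezone.utc)  # normalize to UTC before bucketing
    index = (ts_utc - EPOCH) // step      # whole windows elapsed since the epoch
    start = EPOCH + index * step
    return start, start + step


# A record stamped 15:37 at UTC+01:00 falls in the 14:00-15:00 UTC window.
record_time = datetime(2024, 3, 5, 15, 37, tzinfo=timezone(timedelta(hours=1)))
start, end = tumbling_window_bounds(record_time, size=1, unit="hours")
```

Because each timestamp belongs to exactly one window, consecutive windows never overlap and never leave gaps.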

File Window

| Field | Value | Description |
| --- | --- | --- |
| Data-time field | Field name | Identifier for the field used to configure the Window. |

📘

Note

File window datasets are often used for distribution shift validation, such as in ML use cases to monitor data drift.

If you use a production training dataset as the reference dataset, you can monitor distribution shift metrics between the reference dataset and each newly collected dataset.

For information on numeric reference metrics, such as relative entropy, refer to the Numeric distribution or Categorical distribution Validator types.
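
For intuition about what a distribution shift metric measures, here is a minimal sketch of relative entropy (Kullback-Leibler divergence) between a reference dataset and a newly collected dataset of categorical values. The function name, the additive smoothing, and the direction of the divergence (new data relative to reference) are assumptions for illustration and may differ from how Validio computes its metrics.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def relative_entropy(
    reference: Sequence[Hashable],
    current: Sequence[Hashable],
    smoothing: float = 1e-9,
) -> float:
    """Return KL(current || reference) in nats over categorical values.

    Hypothetical helper. A small additive smoothing term keeps the result
    finite when a category appears in only one of the two datasets.
    """
    if not reference or not current:
        raise ValueError("both datasets must be non-empty")
    categories = set(reference) | set(current)
    ref_counts, cur_counts = Counter(reference), Counter(current)
    ref_total = len(reference) + smoothing * len(categories)
    cur_total = len(current) + smoothing * len(categories)
    divergence = 0.0
    for category in categories:
        p = (cur_counts[category] + smoothing) / cur_total  # new data
        q = (ref_counts[category] + smoothing) / ref_total  # reference data
        divergence += p * math.log(p / q)
    return divergence


training = ["a", "a", "b"]
same_as_training = relative_entropy(training, ["a", "a", "b"])
drifted = relative_entropy(training, ["a", "b", "b"])
```

A value of zero means the two datasets have the same category frequencies, and larger values indicate a larger shift away from the reference.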

Segment Retention Policy

Segment retention period (days) is an optional setting on validator windows that sets a threshold for removing segments that may have become stale. A segment is considered stale when the time since data was last processed on the segment exceeds the retention period.

📘

Note

The threshold is relative to the most recent segment that was processed. When left unset, Validio does not clean or remove stale segments.
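
The retention rule can be sketched as follows. This is an illustrative reading of the description above, not Validio's implementation: the function name `stale_segments` and the `last_processed` mapping are hypothetical, and staleness is measured against the most recently processed segment rather than the current wall-clock time.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_segments(
    last_processed: Dict[str, datetime],
    retention_days: Optional[int],
) -> List[str]:
    """Return the names of segments whose last processed time is too old.

    Hypothetical helper. retention_days=None models an unset retention
    period, in which case no segment is ever reported as stale.
    """
    if retention_days is None or not last_processed:
        return []
    newest = max(last_processed.values())  # most recently processed segment
    cutoff = newest - timedelta(days=retention_days)
    return sorted(name for name, seen in last_processed.items() if seen < cutoff)


last_processed = {
    "SE": datetime(2024, 3, 10, tzinfo=timezone.utc),
    "NO": datetime(2024, 3, 1, tzinfo=timezone.utc),
    "DK": datetime(2024, 2, 1, tzinfo=timezone.utc),
}
to_remove = stale_segments(last_processed, retention_days=30)
kept_when_unset = stale_segments(last_processed, retention_days=None)
```

In this example only the segment last processed more than 30 days before the newest one is flagged, and nothing is flagged when the retention period is unset.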