
Configuration

Configure a Source connector to integrate your data source with the platform. Validio can then read the data for validation and monitoring purposes.

1. Select type

Click + New source and select the Source type you want to connect.

Source configuration wizard - select source type.

2. Credentials

You can either select existing Credentials that have access to the Source, or create new Credentials.

📘

Credential parameters look different depending on the Source

For more information, refer to the respective Source type page.

Source configuration wizard - select credentials or create new credentials.

3. Config

Specify which data asset you want Validio to read.

You can either fill in the config manually, or choose from listed suggestions if available:

  • Available datasets: If Validio has permission to read the associated datasets within a project, you can select datasets from your Source.
  • Recent browser selections: If Validio doesn't have permission, your browser shows recent selections for each field.

📘

Configuration parameters look different depending on the Source

For more information, refer to the respective Source type page.

Source configuration wizard - Source specific config parameters

3.1 Polling interval

For Data warehouses and Object storage, you must set the polling interval parameter to specify how often data is read by Validio.

📘

You can set this parameter using either a preset or a custom 5-field cron expression.

For cron schedule expressions, refer to a cron editor, such as https://crontab.guru/.
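As a rough illustration of what a custom expression looks like, the sketch below lists a few standard 5-field cron expressions and splits each into its named fields. The `split_cron` helper and the example table are hypothetical, written only for this illustration. They are not part of Validio, and the check only counts fields; it does not validate the values.

```python
# Illustrative only: standard 5-field cron expressions and a minimal
# structural check. This is not Validio's parser.
CRON_FIELDS = ("minute", "hour", "day of month", "month", "day of week")

EXAMPLES = {
    "*/15 * * * *": "every 15 minutes",
    "0 * * * *": "at the start of every hour",
    "0 6 * * *": "every day at 06:00",
    "0 0 * * 1": "every Monday at 00:00",
}


def split_cron(expression: str) -> dict[str, str]:
    """Map each of the five whitespace-separated fields to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


for expr, meaning in EXAMPLES.items():
    print(f"{expr:<14} {meaning:<28} {split_cron(expr)}")
```

An expression with a different number of fields, such as `0 6 * *`, raises a `ValueError` in this sketch.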

For Data streams, the polling interval parameter does not exist, since data is read as soon as it is available from the stream processor.

4. Schema

A schema is required for Validio to correctly validate the data and data types of the selected fields. To simplify this step, the platform first attempts to infer the schema from your data. Depending on the Source type, inference might take a few seconds.

🚧

Automatic schema inference only works if there is actual data in the Source.

For example, the stream must not be empty, and a file in Object storage must contain data.
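To see why inference needs actual data, consider the toy sketch below. It derives a field-to-type mapping from sample records and returns nothing when there are no records. The `infer_schema` function is a hypothetical illustration, not Validio's inference logic, and the type names are Python's rather than the platform's.

```python
from typing import Any


def infer_schema(records: list[dict[str, Any]]) -> dict[str, str]:
    """Map each field name to the type name of its first non-null value."""
    schema: dict[str, str] = {}
    for record in records:
        for field, value in record.items():
            # A null value carries no type information, so skip it.
            if value is not None and field not in schema:
                schema[field] = type(value).__name__
    return schema


sample = [
    {"id": 1, "price": 9.5, "name": None},
    {"id": 2, "price": 3.0, "name": "widget"},
]
print(infer_schema(sample))  # fields resolved from the sample records
print(infer_schema([]))      # empty source: nothing to infer from
```

With an empty input the result is an empty mapping, which is the situation described above: there is nothing to infer a schema from.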

Source configuration wizard - Define schema.

4.1 Nullable fields

Select the nullable checkbox if null values are accepted in the field.

If NULL exists in a field where the nullable checkbox is not selected, this particular datapoint is not included in the Validator metrics. For example, in a row count Validator, the datapoint is ignored. You must select the nullable checkbox to validate null values, such as share of null.
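The effect of the checkbox can be modeled with a small sketch. It assumes a single field whose values are held in a Python list, with `None` standing in for NULL. The function names are hypothetical and only mirror the behavior described above; they are not Validio's implementation.

```python
from typing import Optional, Sequence


def included_datapoints(
    values: Sequence[Optional[float]], nullable: bool
) -> list[Optional[float]]:
    """Drop null datapoints unless the field is marked as nullable."""
    if nullable:
        return list(values)
    return [v for v in values if v is not None]


def row_count(values: Sequence[Optional[float]], nullable: bool) -> int:
    """Count only the datapoints that are included in the metric."""
    return len(included_datapoints(values, nullable))


def null_share(values: Sequence[Optional[float]], nullable: bool) -> float:
    """Share of nulls among the included datapoints."""
    kept = included_datapoints(values, nullable)
    if not kept:
        return 0.0
    return sum(v is None for v in kept) / len(kept)


values = [1.0, None, 3.0, None]
print(row_count(values, nullable=False), row_count(values, nullable=True))
print(null_share(values, nullable=False), null_share(values, nullable=True))
```

In this model, a share-of-null metric on a non-nullable field is always zero because the nulls are dropped before the metric is computed, which is why the checkbox must be selected to validate null values.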

📘

The nullable checkboxes are only available for Source types where you can configure nullable fields. They are not available where Validio reads nullability as part of the schema.

Source configuration wizard - Optionally, select nullable fields.

4.2 Cursor field

When you configure Data Warehouse Source types, you must select a cursor field and a lookback time for Validio to read the data.

🚧

Lookback time

The lookback time specifies how far back in time Validio reads data from your source. Choosing a lookback time farther back in the past can lead to longer query time and increased costs when backfilling data.
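The trade-off can be sketched as follows, assuming the cursor field is a timestamp column and the lookback time is a fixed duration. The `updated_at` column, the sample rows, and both helper functions are hypothetical and only illustrate how a longer lookback pulls in more rows.

```python
from datetime import datetime, timedelta, timezone


def lookback_cutoff(now: datetime, lookback: timedelta) -> datetime:
    """Earliest cursor value that falls inside the lookback period."""
    return now - lookback


def rows_in_lookback(rows, cursor_field, now, lookback):
    """Keep rows whose cursor value is at or after the cutoff."""
    cutoff = lookback_cutoff(now, lookback)
    return [row for row in rows if row[cursor_field] >= cutoff]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
rows = [
    {"updated_at": datetime(2024, 1, 30, tzinfo=timezone.utc), "value": 1},
    {"updated_at": datetime(2024, 1, 10, tzinfo=timezone.utc), "value": 2},
    {"updated_at": datetime(2023, 12, 1, tzinfo=timezone.utc), "value": 3},
]

for days in (7, 30):
    kept = rows_in_lookback(rows, "updated_at", now, timedelta(days=days))
    print(f"lookback of {days} days reads {len(kept)} row(s)")
```

In a real warehouse the same filter would typically run as a query against the table, so a wider period means more data scanned.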

For information on cursor field considerations, refer to Data Warehouse.

5. Window

Select a Window type and configure your Window.

Source configuration wizard - Configure a Window.

6. Next steps

📘

Finish setup before you start your Source connector!

Configure your Validators before you start your Source connector, to avoid premature reading of data.