Filter Types

Boolean Filter

Filter records based on a boolean field.

Parameter name

Parameter values

Filter type

Boolean filter

Field

List of source fields with Boolean data types

Operator

Is true Is false


Enum Filter

Filter records based on whether field values are part of a defined set.

Parameter name

Parameter values

Field

List of source fields with string data types

Operator

Is in Is not in

Value

List of strings


Null Filter

Filters records on whether a field values are NULL.

Parameter name

Parameter value

Field

List of source fields

Operator

Is Is not

String Filter

Filter records on whether string field values match specific criteria.

Parameter name

Parameter values

Field

List of source fields with string data types

Operator

Is empty
Is not empty
Contains
Does not contain
Starts with
Ends with
Is exactly
Regex

Value

String

🚧

Supported Regex Syntax

Depending on source system, Validio leverages different regex syntax:

  • For Data Warehouses and Query Engines, regex filters are applied in the source system. For more information on that syntax, refer to the regex syntax of your specific source system.
  • For Data Streams, the regex syntax is compatible with the Rust regex library. For information on that syntax, refer to Rust - regex crate syntax.

Threshold Filter

Filter records on whether numeric values comply to a threshold.

Parameter name

Parameter values

Field

List of source fields with numeric data types

Operator 1

Equal to
Not equal to
Greater than
Greater than or equal to
Less than
Less than or equal to

Value

Float number

1Specifies the 'allowed region'. For example,Greater than or equal to 5 specifies that all values strictly less than five will be filtered.


SQL Filter

You define a SQL filter by typing a valid SQL WHERE statement into the text box, but omitting the word "WHERE". An example of a valid SQL filter is: some_column = 5.

SQL filter works for all Source types, including non-SQL data sources. Different data sources may use different SQL syntax. For more information, refer to the documentation of your data source. For non-SQL data sources, such as Data Streams or Object Storages, you find the supported functionality and syntax in the SQL filter expression reference.

SQL Filter Recipes

The following are recipes for common SQL filters. Each recipe is written in a generic syntax or in syntax specific for Google BigQuery or Snowflake, when required.

You can use these recipes as a starting point when writing SQL filters for other data sources, such as Amazon Redshift or Amazon Athena.

Filter operators

For example,some_column matches a filter with several operators.

some_column IS NULL OR (some_column > 3 AND some_column NOT IN(4,6))

Exists in set

For example,some_column match either of the values 'some_value' or 'some_other_value'.

some_column IN ( 'some_value', 'some_other_value' )

Missing values

For example,some_column is null or matches a value that represents a missing value.

some_column IS NULL OR LOWER(some_column) IN('', '0', '-', 'null')

Semi-structured data

For example,some_array contains at least one element with value 2.

2 IN(SELECT * FROM UNNEST(some_array))
array_contains(2, some_array)

String pattern

For example,some_column does not contain a valid URL.

NOT REGEXP_CONTAINS(some_column, '^(http(s)?:\\/\\/.)?[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)')
NOT REGEXP_LIKE(some_column, '^(http(s)?:\\/\\/.)?[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)')