HomeDocumentationRecipesChangelog
HomeRequest DemoContact
Documentation
HomeRequest DemoContact

ClickHouse

ClickHouse is an open-source column-oriented database management system that uses SQL queries to generate reports.

This integration supports the following semi-structured datatypes from ClickHouse: Tuple (and named tuples), Nested, and Array. For more information, see Tuple, Nested data structures, and Array in the ClickHouse SQL reference.

Prerequisites for Integrating with ClickHouse

Any ClickHouse user account with permissions granted to access the relevant databases and tables. For more information about access rights, see ClickHouse documentation.

โ—๏ธ

Credential Permission Requirements

Validio Credentials require VIEWER access rights when connecting to sources to read and access data. Admins must ensure that they do not provide EDITOR access rights to their credentials.

You can use the following SQL script to set up a ClickHouse User and Role and grant the necessary permissions for the Source in Validio.

  • The user profile has readonly permissions, which means that the user can only execute read queries (such as SELECT).
  • The role is also granted remote permissions to fetch the query logs needed to build the lineage relationships.
CREATE ROLE IF NOT EXISTS validio_role;

CREATE USER IF NOT EXISTS validio_user
    IDENTIFIED WITH sha256_password BY '**************'
    SETTINGS PROFILE 'readonly'
    DEFAULT ROLE validio_role;

GRANT SELECT, SHOW ON mydatabase.* TO validio_role;
GRANT SELECT ON system.query_log TO validio_role;
GRANT REMOTE ON *.* to validio_role;

For more information about ClickHouse permissions for queries, see ClickHouse documentation.

๐Ÿ“˜

Note

Relative entropy validators may require more resources to execute on ClickHouse dev environments. For example, if you plan to use a relative entropy validator, you might see errors due to limited RAM.

Add a ClickHouse Credential

To add a credential for ClickHouse,

  1. Navigate to Credentials and click + New Credential.
  2. Under Namespace, select a namespace where the resources will be created.
  3. For Credential Type, select ClickHouse Credential.
  4. Fill in the credential parameter fields. Refer to the ClickHouse Credential Parameters table.
  5. Check Use for catalog to automatically discover credentials and add them to the catalog page.
  6. Click Create.

Validio will validate the connection to the ClickHouse account. If validation passes, Validio will automatically start fetching data. If validation fails, check that you provided the correct parameter values and try again.

Once the credential is created, you can add a source to monitor ClickHouse data.

ClickHouse Credential Parameters

Field

Description

Name

The identifier for the credential, which will be used when adding sources.

Protocol

Select the ClickHouse client interface: Native (ClickHouse binary), HTTP, or HTTPS.

Host

DNS hostname or IP address to the ClickHouse server.

Port

Port number of the Clickhouse server. Depending on the protocol, the defaults ports are: 9000 (Native), 8123 (HTTP), 8443 (HTTPS). See the list of default network ports in ClickHouse documentation.

Username

Username of the ClickHouse user account with adequate permissions to access the specified database. See creating users and roles in ClickHouse documentation.

Password

Password for the ClickHouse user account.

Default Database

Name of the default ClickHouse database where the table to read is included.

  • *Note:** The default database is required for creating the credential, but you can reference another database when creating a ClickHouse source.

Add a ClickHouse Source

To add a source for ClickHouse,

  1. Navigate to Sources and click + New source.
  2. Under Source type, select ClickHouse.
  3. Under Config,
    1. Select the valid Credential or create a new credential to authenticate your connection to the data warehouse.
    2. Enter the Database, Schema, and Table to specify where the data comes from. Selecting more than one table will create a new source for each table.
    3. Set how many days of Historic data to use when you start the source.
    4. Set the Polling schedule, which is how frequently the validators on the source will check for changes.
  4. Under Schema, click Continue to automatically infer the schema fields from the tables you selected. If you select many tables, this operation can take a few minutes to complete.
  5. Under Source details,
    1. Add Tags to help group related sources or to use for routing notifications.
    2. Add an Owner who will be the contact for incident notifications.
  6. Click Continue to create the source.
    Source names are generated automatically and will be displayed when the source creation completes. If there are more than 5 sources, you will see the names for the first five and a count of the remaining sources.