VPC deployment

Install Validio in a customer VPC. The Validio application is distributed as a Helm chart that can be installed on managed Kubernetes clusters.

Validio is available for the following cloud providers:

  • Google Cloud Platform (GCP): GKE.
  • Amazon Web Services (AWS): EKS.
  • Microsoft Azure: AKS.

Prerequisites

📘

Note

Installing Validio on Customer VPC requires knowledge of:

  • Kubernetes: how to install, configure, and monitor resources on AWS, GCP or Azure.
  • Helm: how to install and configure Helm charts on Kubernetes.

1. Get access to the Docker images using a JSON file provided by Validio. For more information, contact us.

2. Configure a Kubernetes cluster on GCP, AWS or Azure. Consider the following:

  1. All nodes must be in the same AWS availability zone, GCP zone or Azure zone.
  2. Persistent volume provisioning must be enabled so that Kubernetes PVC resources can be created.
  3. We also recommend a configured load balancer or ingress controller, such as:
    1. GCP Ingress Controller
    2. AWS Load Balancer Controller
    3. AKS Application routing add-on

3. Install the following:

Installation

1. Create namespace

Create the namespace you want to use for your configuration. In our examples, we use validio.

kubectl create namespace validio

2. Add the docker-registry secret

Create a Kubernetes Secret to pull the docker images required by the helm chart. This requires the JSON file provided by Validio.

First, set the PULL_CONTAINERS_KEY environment variable to point to your JSON file:

export PULL_CONTAINERS_KEY=/home/user/Downloads/validio-docker-registry.json

Then, add the Kubernetes Secret:

kubectl -n validio create secret docker-registry artifact-registry --docker-server=https://europe-docker.pkg.dev --docker-username=_json_key --docker-password="$(cat "${PULL_CONTAINERS_KEY}")"

3. Install the helm chart

📘

The helm chart installation requires the validio-values.yaml file. For details, refer to configuration and examples.

helm install validio oci://europe-docker.pkg.dev/validio-platform-prod/charts/validio --version [your-validio-version] --namespace validio --values validio-values.yaml

4. Upgrade Validio version

📘

Note

Make sure you use the correct validio-values.yaml file when upgrading Validio. Otherwise, you risk changing your configuration.

helm upgrade validio oci://europe-docker.pkg.dev/validio-platform-prod/charts/validio --version [your-validio-version] --namespace validio --values validio-values.yaml

5. Configure the validio-values.yaml file

The Helm chart supports the following keys in the validio-values.yaml file:

| Key | Default value | Description |
|---|---|---|
| meilisearch.resources.* | nil | Resources for Meilisearch. Suggested values: memory 1000Mi, cpu 200m. |
| meilisearch.storage.capacity | nil | Storage capacity for Meilisearch. Suggested value: 10Gi. |
| meilisearch.storage.class | nil | Storage class for Meilisearch. For example, premium-rwo for GCP and gp2 for AWS. Leave empty to use the cloud provider default. |
| postgres.resources.* | nil | Resources for Postgres. Suggested values: memory 4Gi, cpu 1000m. |
| postgres.storage.capacity | nil | Storage capacity for Postgres. Suggested value: 50Gi. |
| postgres.storage.class | nil | Storage class for Postgres. For example, premium-rwo for GCP and gp2 for AWS. Leave empty to use the cloud provider default. |
| redis.resources.* | nil | Resources for Redis. Suggested values: memory 500Mi, cpu 200m. |
| redis.storage.capacity | nil | Storage capacity for Redis. Suggested value: 20Gi. |
| redis.storage.class | nil | Storage class for Redis. For example, premium-rwo for GCP and gp2 for AWS. Leave empty to use the cloud provider default. |
| seastar.probes.* | nil | Control the Kubernetes Readiness and Liveness probes. |
| seastar.resources.* | nil | Resources for Seastar. Suggested values: memory 1000Mi, cpu 200m. |
| surface.env.http_proxy | nil | Set proxy for HTTP requests. Used by Sentry. |
| surface.env.https_proxy | nil | Set proxy for HTTPS requests. Used by Sentry. |
| surface.ingress.enabled | false | Set to true to enable Kubernetes ingress. Note: This requires an ingress controller and that all ingress parameters are set accordingly. |
| surface.ingress.host | nil | The ingress hostname. Note: You must also set a DNS name. |
| surface.ingress.path | /* | Set the ingress path. For Nginx and Azure, / is required. |
| surface.ingress.annotations | nil | Set the ingress annotations. |
| surface.ingress.managedCert | false | Create a ManagedCertificate resource for the ingress. Note: For GCP only. |
| surface.ingress.tls.enable | false | Enable Ingress TLS. Required for use in Azure with cert-manager. |
| surface.probes.* | nil | Control the Kubernetes Readiness and Liveness probes. |
| surface.resources.* | nil | Resources for Surface. Suggested values: memory 1000Mi, cpu 200m. |
| ve.diver.env.http_proxy | nil | Set proxy for HTTP requests when sending notifications to Slack, Microsoft Teams, and Webhook. |
| ve.diver.env.https_proxy | nil | Set proxy for HTTPS requests when sending notifications to Slack, Microsoft Teams, and Webhook. |
| ve.diver.probes.* | nil | Control the Kubernetes Readiness and Liveness probes. |
| ve.diver.resources.* | nil | Resources for Diver. Suggested values: memory 500Mi, cpu 200m. |
| ve.ingress.probes.* | nil | Control the Kubernetes Readiness and Liveness probes. |
| ve.ingress.resources.* | nil | Resources for Ingress. Suggested values: memory 1000Mi, cpu 500m. Note: This is not a Kubernetes ingress parameter. |
| ve.ingress.cache.size | 1Gi | Local disk cache for Ingress. Allocated on the host where the pod runs. |
| ve.pipelines.probes.* | nil | Control the Kubernetes Readiness and Liveness probes. |
| ve.pipelines.resources.* | nil | Resources for Pipelines. Suggested values: memory 2Gi, cpu 1000m. |
| ve.pipelines.cache.size | 1Gi | Local disk cache for Pipelines. Allocated on the host where the pod runs. |

If the default value is false or true, the value type is bool. For all others, the value type is String.
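As an illustration of the string-typed storage keys, the fragment below is a minimal sketch of the storage settings for the three stateful services. It assumes an AWS cluster, uses the suggested capacities from the key table, and uses gp2 only because the table names it as the AWS example class; substitute the storage class available in your own cluster, or leave class unset to use the provider default.

```yaml
# Sketch only: storage section of validio-values.yaml, assuming AWS.
# Capacities are the suggested values from the key table.
meilisearch:
  storage:
    capacity: 10Gi
    class: gp2   # table's AWS example; premium-rwo is the GCP example
postgres:
  storage:
    capacity: 50Gi
    class: gp2
redis:
  storage:
    capacity: 20Gi
    class: gp2
```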

All <service>.resources.* keys support both CPU and memory resources. The table lists them only as <service>.resources.*.

Available keys for each service:

<service>:
  resources:
    limits:
      cpu: <value>
      memory: <value>
    requests:
      cpu: <value>
      memory: <value>

📘

Note

The following are memory and CPU recommendations:

  • Set a limit for memory use with <service>.resources.limits.memory.
  • Only set requests for CPU resources with <service>.resources.requests.cpu.

For more information, refer to Kubernetes resources documentation.
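Applied to a single service, these recommendations could look like the sketch below. It uses Pipelines with the suggested values from the key table; the choice of service is only for illustration, and the right values depend on your workload.

```yaml
# Sketch only: memory is capped with a limit, CPU is only requested.
ve:
  pipelines:
    resources:
      limits:
        memory: 2Gi    # suggested value from the key table
      requests:
        cpu: 1000m     # suggested value; no CPU limit is set
```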

Most services also support controlling the Kubernetes Liveness and Readiness probes, listed in the table above as <service>.probes.*. The available keys are listed below. The probes are enabled by default and the commented values are the defaults, so you only need to change them if the defaults do not suit your environment.

<service>:
  probes:
    liveness:
      enabled: true
      # failureThreshold: 3
      # initialDelaySeconds: 0
      # periodSeconds: 10
      # successThreshold: 1
      # terminationGracePeriodSeconds: 30
      # timeoutSeconds: 1
    readiness:
      enabled: true
      # failureThreshold: 3
      # initialDelaySeconds: 0
      # periodSeconds: 10
      # successThreshold: 1
      # terminationGracePeriodSeconds: 30
      # timeoutSeconds: 1
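To override a default, uncomment the key and set it for the service in question. The fragment below is a hedged sketch that gives Pipelines more time before the liveness probe starts and tolerates more failures. The numbers are illustrative assumptions, not recommendations from Validio.

```yaml
# Sketch only: override two liveness settings for one service.
# Keys that are not listed keep their defaults.
ve:
  pipelines:
    probes:
      liveness:
        enabled: true
        initialDelaySeconds: 30   # illustrative; default is 0
        failureThreshold: 5       # illustrative; default is 3
      readiness:
        enabled: true
```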

HTTP Proxy

Validio supports using an HTTP proxy for outbound communication to Slack, Microsoft Teams, and Sentry. The URLs for http_proxy and https_proxy can be the same. Internally, the application uses http_proxy for HTTP URLs and https_proxy for HTTPS URLs.

If your proxy requires destinations to be explicitly allowed, these are the hosts that Validio connects to:

hooks.slack.com
sentry.io
o4506020911316992.ingest.sentry.io
*.outlook.office.com

Here is an example configuration:

surface:
  env:
    http_proxy: http://proxy.example.com:8080
    https_proxy: http://proxy.example.com:8080
ve:
  diver:
    env:
      http_proxy: http://proxy.example.com:8080
      https_proxy: http://proxy.example.com:8080

Sign in to the Validio platform

The first time you sign in to the Validio platform, you must use the admin credentials created in the Kubernetes Secret. The command below retrieves the password.

🚧

Caution

Do not change or update anything in the Kubernetes Secret. You can change the password in the Validio platform.

kubectl -n validio get secrets validio-postgres -o=jsonpath='{.data.validio_app_auth_password}' | base64 -d ; echo

Access your Validio platform:

In your browser, navigate to the hostname specified in the surface.ingress.host key. For example, https://validio.example.com/

If you did not configure an ingress, use port-forward to forward the Validio platform to your localhost:

kubectl -n validio port-forward svc/validio-surface 8889

Then, open http://localhost:8889/ in your browser.

Services

The Validio services are listed as follows. We recommend that you use this list as a reference when you allocate resources to services running in your system.

Validation Engine

The Validation Engine is the backend application in Validio. The Validation Engine consists of the following services:

  • Diver - orchestrates and runs background tasks for the Validio application.
  • Ingress - connects to configured sources to read statistics, data, and source metadata.
  • Pipelines - handles processing of any read data and performs calculations and anomaly detection on data statistics.

Surface

Surface is the "backend for the frontend". This service is responsible for communication between the frontend application in your browser and the Validation Engine.

Seastar

Seastar handles searching and finding information in the frontend. It uses Meilisearch as its backend.

Postgres

Postgres governs all artifacts and configuration settings.

Redis

Redis is used as a transport layer between different services.

Meilisearch

Meilisearch is used as the search engine for Surface.

Flux GitOps tools

Optionally, you can use Flux to control your GitOps flow. For more information, refer to Flux Helm OCI repository.
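A Flux setup for the Validio chart might be sketched as follows. This is not an official Validio configuration. The resource names, the intervals, and the validio-values ConfigMap are hypothetical, and the apiVersion values assume a recent Flux release (older releases use beta versions of the same kinds). The sketch reuses the artifact-registry Secret from the installation steps to authenticate against the OCI registry.

```yaml
# Sketch only: names and intervals are illustrative assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: validio
  namespace: validio
spec:
  type: oci
  interval: 10m
  url: oci://europe-docker.pkg.dev/validio-platform-prod/charts
  secretRef:
    name: artifact-registry   # docker-registry Secret from installation step 2
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: validio
  namespace: validio
spec:
  interval: 10m
  chart:
    spec:
      chart: validio
      version: "[your-validio-version]"
      sourceRef:
        kind: HelmRepository
        name: validio
  valuesFrom:
    - kind: ConfigMap
      name: validio-values              # hypothetical ConfigMap holding your values file
      valuesKey: validio-values.yaml
```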

Examples

Example of GCP ingress

surface:
  ingress:
    enabled: true
    host: validio.example.com
    managedCert: true
    annotations:
      kubernetes.io/ingress.allow-http: "false"
      kubernetes.io/ingress.class: gce

This example creates a Kubernetes Ingress resource and a ManagedCertificate resource with the specified hostname. Make sure that you create a DNS A record that points to the IP address of the load balancer. Without a DNS A record, the certificate is not created and the Kubernetes Ingress won’t work.

Use the following command to verify the IP address of the load balancer:

kubectl -n validio get ingress validio-surface -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

Example of AWS ingress

surface:
  ingress:
    enabled: true
    host: validio.example.com
    annotations:
      alb.ingress.kubernetes.io/certificate-arn: <ARN of certificate>
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/success-codes: 200,302
      alb.ingress.kubernetes.io/target-type: ip
      kubernetes.io/ingress.class: alb

This requires the AWS Load Balancer Controller Add-On. This example creates a Kubernetes Ingress resource with the specified hostname. You must first create a certificate with AWS Certificate Manager as specified in the annotations part. Then, use the Alias option to create a DNS A record that points to the endpoint of the load balancer.

Use the following command to verify the endpoint of the load balancer:

kubectl -n validio get ingress validio-surface -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

Example of Azure ingress

surface:
  ingress:
    enabled: true
    host: validio.example.com
    annotations:
      cert-manager.io/acme-challenge-type: http01
      cert-manager.io/cluster-issuer: letsencrypt
    ingressClassName: webapprouting.kubernetes.azure.com
    tls:
      enabled: true
    path: /

This requires the AKS Application routing add-on with a working external-dns setup and cert-manager for TLS. This example creates a Kubernetes Ingress resource with the specified hostname using TLS.

Example of Nginx Ingress

An Nginx Ingress can be configured in several ways. This example shows only the basic options. Depending on your environment, you might need to add annotations specific to your Kubernetes installation.

surface:
  ingress:
    enabled: true
    host: validio.example.com
    path: /
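As an illustration of such an environment-specific addition, the sketch below selects the controller with the same kubernetes.io/ingress.class annotation that the GCP and AWS examples use. The class name nginx is an assumption. Check the class name and whether your controller expects the annotation or an ingress class name in your own cluster.

```yaml
# Sketch only: basic Nginx options plus a class annotation.
surface:
  ingress:
    enabled: true
    host: validio.example.com
    path: /
    annotations:
      kubernetes.io/ingress.class: nginx   # assumed class name; adjust to your controller
```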

Example of memory and CPU resource settings

🚧

Caution

Below is an example configuration that uses the resource suggestions above together with some other sensible values. The surface.ingress keys below are specific to GCP, AWS or Azure. Refer to the examples above. Validio provides customer-specific configuration when applicable.

meilisearch:
  resources:
    limits:
      memory: 1000Mi
    requests:
      cpu: 200m
  storage:
    capacity: 10Gi
    class: managed-premium
postgres:
  resources:
    limits:
      memory: 4000Mi
    requests:
      cpu: 200m
  storage:
    capacity: 50Gi
    class: managed-premium
redis:
  resources:
    limits:
      memory: 500Mi
    requests:
      cpu: 200m
  storage:
    capacity: 20Gi
    class: managed-premium
seastar:
  resources:
    limits:
      memory: 1000Mi
    requests:
      cpu: 200m
surface:
  ingress:
    enabled: true
    host: example.validio.io
  resources:
    limits:
      memory: 1000Mi
    requests:
      cpu: 200m
  service:
    type: ClusterIP
ve:
  diver:
    resources:
      limits:
        memory: 500Mi
      requests:
        cpu: 200m
  ingress:
    cache:
      size: 10G
    resources:
      limits:
        memory: 1000Mi
      requests:
        cpu: 200m
  pipelines:
    cache:
      size: 10G
    resources:
      limits:
        memory: 1000Mi
      requests:
        cpu: 200m