OpenLineage

Ingest lineage events and metadata from OpenLineage-compatible tools into Validio.

When your data pipelines emit OpenLineage events, Validio processes them to automatically create catalog assets, establish lineage relationships, and import dataset metadata.

This enables lineage visibility for pipelines orchestrated by tools such as Apache Airflow, Apache Spark, dbt, Great Expectations, and any other tool that implements the OpenLineage standard.

Prerequisites

  • An API key with catalogAssets:WRITE permission. See Managing API Keys for how to create and manage API keys.

Configure Your OpenLineage Producer

Point your OpenLineage-compatible tool at the Validio endpoint. The exact configuration depends on your producer, but all producers use the same endpoint URL and authentication.

Endpoint

POST https://<your-validio-domain>/openlineage/v1/lineage

Replace <your-validio-domain> with the domain of your Validio instance.

Authentication

Include your API key credentials in the Authorization header. The credentials are always in the format <accessKeyId>:<secretAccessKey>. Validio supports three header formats:

Format | Header Value | Description
Raw | <accessKeyId>:<secretAccessKey> | Recommended. Send credentials directly as the header value.
Basic | Basic <base64 of accessKeyId:secretAccessKey> | Standard HTTP Basic authentication with base64-encoded credentials.
Bearer | Bearer <accessKeyId>:<secretAccessKey> | Bearer token format with raw credentials.
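
The three header formats can be sketched in Python as follows. This is a minimal illustration only: the function name authorization_header and its fmt parameter are hypothetical and not part of any Validio client.

```python
import base64


def authorization_header(access_key_id: str, secret_access_key: str, fmt: str = "raw") -> dict:
    """Build an Authorization header in one of the three supported formats.

    Hypothetical helper for illustration; fmt is "raw", "basic", or "bearer".
    """
    # Credentials are always <accessKeyId>:<secretAccessKey>.
    credentials = f"{access_key_id}:{secret_access_key}"

    if fmt == "raw":
        # Raw: the credentials themselves are the header value.
        value = credentials
    elif fmt == "basic":
        # Basic: base64-encode the credentials and prefix with "Basic ".
        encoded = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
        value = f"Basic {encoded}"
    elif fmt == "bearer":
        # Bearer: prefix the unencoded credentials with "Bearer ".
        value = f"Bearer {credentials}"
    else:
        raise ValueError(f"unknown format: {fmt!r}")

    return {"Authorization": value}
```

The returned dictionary can be merged into the headers of whichever HTTP client your producer uses.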

For producer-specific configuration, consult the OpenLineage integrations documentation and configure the HTTP transport with the Validio endpoint URL and API key authentication shown above.
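
If you want to check the endpoint without a full producer, the sketch below builds a minimal OpenLineage run event and a POST request for it using only the Python standard library. Several details are assumptions made for illustration: the helper names, the job and dataset names, the producer URI, the spec version in schemaURL, and the JSON Content-Type header. The Raw authentication format is used.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone


def build_run_event(event_type: str, run_id: str) -> dict:
    """Build a minimal OpenLineage run event (all names are placeholders)."""
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/my-pipeline",  # hypothetical producer URI
        # Assumed spec version; use the one your producer targets.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": run_id},
        "job": {"namespace": "example-namespace", "name": "daily_orders_load"},
        "inputs": [{"namespace": "postgres://host:5432/db", "name": "public.orders"}],
        "outputs": [{"namespace": "snowflake://account/db", "name": "analytics.orders"}],
    }


def build_request(domain: str, access_key_id: str, secret_access_key: str,
                  event: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the Validio endpoint."""
    return urllib.request.Request(
        url=f"https://{domain}/openlineage/v1/lineage",
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumed: events are sent as JSON
            "Authorization": f"{access_key_id}:{secret_access_key}",  # Raw format
        },
        method="POST",
    )


# Example: prepare a COMPLETE event for a placeholder domain and credentials.
request = build_request(
    "validio.example.com", "my-access-key-id", "my-secret-access-key",
    build_run_event("COMPLETE", str(uuid.uuid4())),
)
```

To send the event, pass the prepared request to urllib.request.urlopen(request) and inspect the response status. For production pipelines, prefer the HTTP transport of your OpenLineage integration over hand-built requests.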

What Validio Processes

When Validio receives an OpenLineage event, it extracts:

  • Lineage relationships — Input and output datasets from the event are mapped to catalog assets, and lineage edges are created between them.
  • Dataset metadata — Descriptions, tags, and other metadata from OpenLineage facets are imported and associated with the corresponding catalog assets.
  • Run state — Event timestamps, run identifiers, and event types (START, COMPLETE, FAIL) are tracked for observability.
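
To make the mapping concrete, the sketch below shows which fields of an event correspond to the three items above. It is an illustration only and not Validio's implementation: the function names are hypothetical, pairing every input with every output is a simplification, and reading descriptions from the OpenLineage documentation dataset facet is an assumption about where such metadata typically lives.

```python
def dataset_key(dataset: dict) -> tuple:
    """Identify a dataset by its OpenLineage namespace and name."""
    return (dataset["namespace"], dataset["name"])


def summarize_event(event: dict) -> dict:
    """Pull lineage edges, descriptions, and run state out of one event."""
    inputs = event.get("inputs") or []
    outputs = event.get("outputs") or []

    # Lineage relationships: one edge from each input to each output.
    edges = [(dataset_key(src), dataset_key(dst)) for src in inputs for dst in outputs]

    # Dataset metadata: descriptions taken from the documentation facet, if present.
    descriptions = {}
    for dataset in inputs + outputs:
        documentation = (dataset.get("facets") or {}).get("documentation") or {}
        if "description" in documentation:
            descriptions[dataset_key(dataset)] = documentation["description"]

    # Run state: event type, timestamp, and run identifier.
    run_state = {
        "event_type": event.get("eventType"),
        "event_time": event.get("eventTime"),
        "run_id": (event.get("run") or {}).get("runId"),
    }

    return {"edges": edges, "descriptions": descriptions, "run_state": run_state}
```

Calling summarize_event on an event with one input and one output yields a single edge between the two datasets, plus any descriptions found on either of them.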

How It Appears in Validio

Catalog Assets

Assets discovered through OpenLineage events appear in your Catalog alongside assets from other credential types. The asset namespace reflects the OpenLineage dataset namespace (e.g., postgres://host:5432/db, snowflake://account/db).

Lineage

Lineage edges created from OpenLineage input/output relationships are visible in the Lineage graph. These edges behave like any other lineage edge — you can inspect them, add descriptions, and use them for glossary term propagation.

Descriptions

Descriptions imported from OpenLineage metadata appear with the OpenLineage origin label, alongside descriptions from other sources such as dbt and Catalog Refresh. The imported OpenLineage description itself is read-only. To customize a description, edit it: Validio creates its own copy, which takes priority in the display.

For more information about description origins, see Catalog Assets.

Related Resources