OpenLineage
Ingest lineage events and metadata from OpenLineage-compatible tools into Validio.
Validio supports ingesting lineage events from OpenLineage-compatible tools. When your data pipelines emit OpenLineage events, Validio processes them to automatically create catalog assets, establish lineage relationships, and import dataset metadata.
This enables lineage visibility for pipelines orchestrated by tools such as Apache Airflow, Apache Spark, dbt, Great Expectations, and any other tool that implements the OpenLineage standard.
Prerequisites
- An API key with the catalogAssets:WRITE permission. See Managing API Keys for how to create and manage API keys.
Configure Your OpenLineage Producer
Point your OpenLineage-compatible tool at the Validio endpoint. The exact configuration depends on your producer, but every producer uses the same endpoint URL and authentication.
Endpoint
POST https://<your-validio-domain>/openlineage/v1/lineage
Replace <your-validio-domain> with your Validio instance URL.
Authentication
Include your API key credentials in the Authorization header. The credentials are always in the format <accessKeyId>:<secretAccessKey>. Validio supports three header formats:
| Format | Header Value | Description |
|---|---|---|
| Raw | <accessKeyId>:<secretAccessKey> | Recommended. Send credentials directly as the header value. |
| Basic | Basic <base64 of accessKeyId:secretAccessKey> | Standard HTTP Basic authentication with base64-encoded credentials. |
| Bearer | Bearer <accessKeyId>:<secretAccessKey> | Bearer token format with raw credentials. |
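As a rough illustration of the three formats in the table, the following Python sketch builds each header value from the same credential pair. The function name and the credential strings are placeholders invented for this example, not part of any Validio API.

```python
import base64


def authorization_headers(access_key_id: str, secret_access_key: str) -> dict[str, str]:
    """Return the three Authorization header values described in the table."""
    # The credentials are always <accessKeyId>:<secretAccessKey>.
    credentials = f"{access_key_id}:{secret_access_key}"
    # Basic authentication base64-encodes the same credential string.
    encoded = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return {
        "raw": credentials,
        "basic": f"Basic {encoded}",
        "bearer": f"Bearer {credentials}",
    }


# Placeholder credentials; substitute the values from your own API key.
headers = authorization_headers("my-access-key-id", "my-secret-access-key")
for name, value in headers.items():
    print(f"{name}: Authorization: {value}")
```

Any one of the three values can be sent as the Authorization header; the raw format is the one the table recommends.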
For producer-specific configuration, consult the OpenLineage integrations documentation and configure the HTTP transport with the Validio endpoint URL and API key authentication shown above.
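If you want to check the endpoint and credentials before configuring a real producer, a hand-built event can be posted with the Python standard library alone. The sketch below is a minimal example under several assumptions: the domain, credentials, job name, dataset names, producer URI, and the spec version in schemaURL are all placeholders chosen for illustration, and the event contains only the basic fields of an OpenLineage run event. The request is constructed but not sent.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

# Placeholders: replace with your Validio domain and API key credentials.
VALIDIO_DOMAIN = "validio.example.com"
CREDENTIALS = "my-access-key-id:my-secret-access-key"


def build_event(event_type: str, run_id: str) -> dict:
    """Build a minimal OpenLineage run event (all names are illustrative)."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": "example-pipelines", "name": "daily_orders"},
        "inputs": [{"namespace": "postgres://host:5432/db", "name": "public.orders"}],
        "outputs": [{"namespace": "snowflake://account/db", "name": "analytics.orders"}],
        # Assumed values: use your producer's URI and the spec version it targets.
        "producer": "https://example.com/my-producer",
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


def build_request(event: dict) -> urllib.request.Request:
    """Wrap the event in a POST request using the raw Authorization format."""
    return urllib.request.Request(
        url=f"https://{VALIDIO_DOMAIN}/openlineage/v1/lineage",
        data=json.dumps(event).encode("utf-8"),
        headers={"Authorization": CREDENTIALS, "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(build_event("COMPLETE", str(uuid.uuid4())))
print(request.get_method(), request.full_url)
# To actually send the event, pass the request to urllib.request.urlopen(request).
```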
What Validio Processes
When Validio receives an OpenLineage event, it extracts:
- Lineage relationships — Input and output datasets from the event are mapped to catalog assets, and lineage edges are created between them.
- Dataset metadata — Descriptions, tags, and other metadata from OpenLineage facets are imported and associated with the corresponding catalog assets.
- Run state — Event timestamps, run identifiers, and event types (START, COMPLETE, FAIL) are tracked for observability.
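To make the three categories above concrete, the sketch below pulls the same kinds of information out of an event dictionary. It is only an illustration of where each item lives in an OpenLineage event, not Validio's implementation: the helper names are hypothetical, pairing every input with every output is an assumption about how edges are derived, and only the standard documentation facet is read for descriptions.

```python
from itertools import product


def dataset_id(dataset: dict) -> str:
    """Identify a dataset by its namespace and name (illustrative format)."""
    return f"{dataset['namespace']}/{dataset['name']}"


def summarize_event(event: dict) -> dict:
    """Collect lineage edges, descriptions, and run state from one event."""
    inputs = event.get("inputs", [])
    outputs = event.get("outputs", [])
    # Assumption: one edge from every input dataset to every output dataset.
    edges = [(dataset_id(i), dataset_id(o)) for i, o in product(inputs, outputs)]
    # Descriptions come from the documentation facet, when a dataset has one.
    descriptions = {}
    for dataset in inputs + outputs:
        doc = dataset.get("facets", {}).get("documentation", {})
        if "description" in doc:
            descriptions[dataset_id(dataset)] = doc["description"]
    run_state = {
        "runId": event["run"]["runId"],
        "eventType": event.get("eventType"),
        "eventTime": event["eventTime"],
    }
    return {"edges": edges, "descriptions": descriptions, "run_state": run_state}


# A made-up event used only to exercise the helper.
sample_event = {
    "eventType": "COMPLETE",
    "eventTime": "2024-01-01T00:00:00Z",
    "run": {"runId": "run-1"},
    "job": {"namespace": "example-pipelines", "name": "daily_orders"},
    "inputs": [{"namespace": "postgres://host:5432/db", "name": "public.orders"}],
    "outputs": [{
        "namespace": "snowflake://account/db",
        "name": "analytics.orders",
        "facets": {"documentation": {"description": "Cleaned orders"}},
    }],
}
summary = summarize_event(sample_event)
print(summary)
```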
How It Appears in Validio
Catalog Assets
Assets discovered through OpenLineage events appear in your Catalog alongside assets from other credential types. The asset namespace reflects the OpenLineage dataset namespace (e.g., postgres://host:5432/db, snowflake://account/db).
Lineage
Lineage edges created from OpenLineage input/output relationships are visible in the Lineage graph. These edges behave like any other lineage edge — you can inspect them, add descriptions, and use them for glossary term propagation.
Descriptions
Descriptions imported from OpenLineage metadata appear with the OpenLineage origin label, alongside descriptions from other sources such as dbt and Catalog Refresh. The imported OpenLineage descriptions themselves are read-only. To customize one, edit it: Validio then creates its own copy, which takes priority in the display.
For more information about description origins, see Catalog Assets.