When creating a Source Connector, you have the option to also create a Metadata Connector. This connector uses the same third-party integration as the data connector, but it collects source-specific metadata instead of data.
The Metadata Connector lets you track metrics such as null values, row count, and schema changes.
Conceptually, the Metadata Connector is treated just like a normal Source Connector: dataset and datapoint pipelines can be applied to it, as can Monitors and Filters.
Because the Metadata Connector fetches metadata produced by the third-party sources themselves, the metadata features that can be monitored differ from source to source. See each source's dedicated page for the metadata features it supports.
Below is an example of supported metadata features in BigQuery:
|Metadata feature|Description|
|---|---|
|schema_changed|A boolean flag. It is True if the table schema has changed.|
|freshness|The date and time of the most recent data change in the table.|
|last_hour_rate_per_second|The ratio of rows that were valid relative to successfully ingested rows in the last hour.|
|last_hour_invalid_ratio|The ratio of rows that were invalid relative to successfully ingested rows in the last hour.|
|last_day_rate_per_second|The ratio of rows that were valid relative to successfully ingested rows in the last day.|
|last_day_invalid_ratio|The ratio of rows that were invalid relative to successfully ingested rows in the last day.|
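
To make these features more concrete, here is a minimal Python sketch of how one metadata datapoint might be represented and checked. It is an illustration only, not the product's API: the `TableMetadata` class, the `find_issues` function, and the threshold values are all hypothetical, and only the field names are taken from the BigQuery features listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TableMetadata:
    """Hypothetical container for one metadata datapoint.

    Field names mirror the BigQuery metadata features listed above;
    the class itself is an assumption made for illustration.
    """
    schema_changed: bool
    freshness: datetime
    last_hour_invalid_ratio: float
    last_day_invalid_ratio: float


def find_issues(meta, now,
                max_staleness=timedelta(hours=6),
                max_invalid_ratio=0.01):
    """Return the names of the checks that fail (thresholds are assumed)."""
    issues = []
    if meta.schema_changed:
        issues.append("schema_changed")
    # Flag the table as stale if its last data change is too old.
    if now - meta.freshness > max_staleness:
        issues.append("stale")
    if meta.last_hour_invalid_ratio > max_invalid_ratio:
        issues.append("last_hour_invalid_ratio")
    if meta.last_day_invalid_ratio > max_invalid_ratio:
        issues.append("last_day_invalid_ratio")
    return issues


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = TableMetadata(
    schema_changed=True,
    freshness=now - timedelta(hours=12),
    last_hour_invalid_ratio=0.2,
    last_day_invalid_ratio=0.005,
)
print(find_issues(sample, now))
```

In practice, checks of this kind would be configured as Monitors and Filters on the Metadata Connector rather than written by hand.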
How often metadata is collected depends on the type of source:
- Data warehouses and object stores: Metadata is queried at the same time as the actual data, at the polling interval set when setting up the Source Connector.
- Streaming: Metadata from streaming tools is currently polled every 60 seconds.
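
The polling rules above can be summarized in a short Python sketch. The function name, the source-kind strings, and the parameter name are hypothetical and chosen only for illustration; the 60-second value for streaming sources comes from the text above.

```python
# Streaming metadata is currently polled every 60 seconds (from the text above).
STREAMING_METADATA_POLL_SECONDS = 60


def metadata_poll_interval_seconds(source_kind, data_polling_interval_seconds=None):
    """Return how often metadata is collected for a given kind of source.

    `source_kind` values are assumed labels, not identifiers used by the product.
    """
    if source_kind == "streaming":
        return STREAMING_METADATA_POLL_SECONDS
    if source_kind in ("data_warehouse", "object_store"):
        # Metadata is queried together with the data, so it follows the
        # polling interval configured on the Source Connector.
        if data_polling_interval_seconds is None:
            raise ValueError("a polling interval is required for this source kind")
        return data_polling_interval_seconds
    raise ValueError(f"unknown source kind: {source_kind!r}")


print(metadata_poll_interval_seconds("data_warehouse", 3600))
print(metadata_poll_interval_seconds("streaming"))
```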