Consider the following when you read and validate records from a Data stream:
- In each stream, all messages must be consistent with the declared schema:
  - Fields added after creating the connector are ignored.
  - Missing fields are interpreted as empty fields, which affects any analytics involving those fields.
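The two schema rules above can be illustrated with a minimal sketch. This is not Validio's implementation; the `DECLARED_FIELDS` list, the `project_onto_schema` helper, and the sample message are all hypothetical, and `None` is assumed here to stand in for an "empty" field.

```python
from typing import Any

# Hypothetical declared schema: the field names fixed when the Source was created.
DECLARED_FIELDS = ["user_id", "amount", "created_at"]


def project_onto_schema(message: dict[str, Any], declared_fields: list[str]) -> dict[str, Any]:
    """Keep only declared fields and fill missing ones with None (assumed 'empty')."""
    # Fields not in declared_fields are dropped; absent declared fields become None.
    return {name: message.get(name) for name in declared_fields}


# "coupon" was added after the schema was declared; "created_at" is missing.
message = {"user_id": 7, "amount": 12.5, "coupon": "SPRING"}
record = project_onto_schema(message, DECLARED_FIELDS)
print(record)  # {'user_id': 7, 'amount': 12.5, 'created_at': None}
```

In this sketch, the empty `created_at` value is what would later influence any metric computed over that field, such as a null count or a freshness check.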
Reading data from Data streams has the following associated costs:
- Each Source in Validio corresponds to one consumer of your Data stream.
- If the traffic crosses cloud regions, there are potential network costs between Validio and the Data stream.
Based on your data, Validio helps infer a schema for your Source. Schema inference requires that the Data stream is not empty when you connect your Source.
The Data stream can be considered empty by the Validio platform in the following circumstances:
- A Data stream has no events when connecting to Validio. For example, all events were deleted according to the retention period, and no new events have been published.
- Validio infers fields with the `Timestamp` datatype in the schema. These fields indicate, for example, when an event or message was created or published in the Data stream.
- For Pub/Sub and Pub/Sub Lite, Validio also infers the `validio_publish_time` field in the schema. The `validio_publish_time` field contains the timestamp that Pub/Sub generates when a message is published to the stream. The timestamp is in RFC 3339 UTC "Zulu" format.
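As an illustration of the RFC 3339 UTC "Zulu" format, the following minimal sketch parses such a timestamp with the Python standard library. The `parse_publish_time` helper and the sample value are hypothetical and not taken from Validio. The sketch assumes at most microsecond precision; a value with more fractional digits would need to be truncated first.

```python
from datetime import datetime, timezone


def parse_publish_time(value: str) -> datetime:
    """Parse an RFC 3339 UTC ("Zulu") timestamp into a timezone-aware datetime."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 onward,
    # so replace it with an explicit UTC offset for compatibility.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)


# Hypothetical sample value with millisecond precision.
sample = "2024-01-15T10:30:00.123Z"
parsed = parse_publish_time(sample)
print(parsed.isoformat())  # 2024-01-15T10:30:00.123000+00:00
```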