Prerequisites

Validio currently supports ingesting Kinesis records in JSON format.

You will need:

  • An IAM user with permissions to access the specified Kinesis stream
  • Credentials for the IAM user. Either:
    • Access key and secret key
    • Temporary security credentials token

IAM user

We recommend creating a dedicated IAM user with read-only access so that the Validio platform can ingest data from your Kinesis stream. Details of these permissions can be found in the Kinesis documentation here.

The following permissions are required:

  • Action DescribeStreamSummary; access level Read; resource type stream
  • Action ListShards; access level List
  • Action GetShardIterator; access level Read; resource type stream
  • Action GetRecords; access level Read; resource type stream
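The permissions above can be expressed as an IAM policy document. The following is a minimal sketch rather than an official Validio policy: the helper name `build_read_only_policy` and the account ID `123456789012` are placeholders introduced for illustration, and the policy assumes that access is scoped to a single stream ARN.

```python
import json

# The four actions listed above, prefixed with the Kinesis service namespace.
REQUIRED_ACTIONS = [
    "kinesis:DescribeStreamSummary",
    "kinesis:ListShards",
    "kinesis:GetShardIterator",
    "kinesis:GetRecords",
]


def build_read_only_policy(region: str, account_id: str, stream_name: str) -> dict:
    """Return an IAM policy document allowing the read actions on one stream."""
    stream_arn = f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": REQUIRED_ACTIONS,
                "Resource": stream_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder region, account ID, and stream name; replace with your own.
    policy = build_read_only_policy("eu-north-1", "123456789012", "east-coast-weather")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the IAM user created for Validio.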

Credentials

You must supply either an access key and secret key or a temporary security credentials token.

  • Information on obtaining an access key and secret key can be found here
  • Information on how to obtain a temporary security credentials token can be found here

Kinesis configuration parameters

| Field | Required | Description | Examples |
|---|---|---|---|
| Name | | Identifier for the connector. Used when setting up pipelines. | East coast weather forecast |
| Region | | Region that the Kinesis stream is available in. | eu-north-1 |
| Access key | ✅* | Required if not using a temporary security credentials token. Used together with the secret key by the Validio platform to access the stream. | AKUEBNDJHA7BJSHJVA6F |
| Secret key | ✅* | Required if not using a temporary security credentials token. Used together with the access key by the Validio platform to access the stream. | aslk/KOJHJKBuibhj2jq3csaVGHYTFgvhcascuyg |
| Token | ✅* | Required if not using an access key and secret key. The token the Validio platform uses to access the stream. | |
| Endpoint | | Endpoint of the AWS service. Useful for testing with LocalStack. | |
| Stream name | | Name of the stream to ingest data from. You can find it in the AWS console. | east-coast-weather |
| Shard iterator type | | Where to start reading data from the stream. At timestamp: read messages starting from the Shard iterator timestamp. Latest: read only messages published after reading starts. Trim horizon: read all messages in the stream, starting from the oldest available record. | |
| Shard iterator timestamp | | Required if Shard iterator type is At timestamp. | |
| Enhanced Fan-out | | Use dedicated throughput to consume data from the stream. Learn more about it here. | |

[*] Required depending on which credential method is used
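The conditional requirements in the parameters above can be checked before you submit the form. The sketch below is illustrative only: `KinesisSourceConfig` and `validate` are hypothetical names, not part of any Validio API, and the field names simply mirror the parameter list.

```python
from dataclasses import dataclass
from typing import List, Optional

# Shard iterator types as named in the parameter list.
ITERATOR_TYPES = ("At timestamp", "Latest", "Trim horizon")


@dataclass
class KinesisSourceConfig:
    """Hypothetical container mirroring the configuration parameters."""
    name: str
    region: str
    stream_name: str
    shard_iterator_type: str
    access_key: Optional[str] = None
    secret_key: Optional[str] = None
    token: Optional[str] = None
    endpoint: Optional[str] = None
    shard_iterator_timestamp: Optional[str] = None
    enhanced_fan_out: bool = False


def validate(config: KinesisSourceConfig) -> List[str]:
    """Return a list of problems; an empty list means the rules are satisfied."""
    errors = []
    # Either an access key plus secret key, or a token, must be supplied.
    has_key_pair = bool(config.access_key and config.secret_key)
    if not has_key_pair and not config.token:
        errors.append("Supply an access key and secret key, or a token.")
    if config.shard_iterator_type not in ITERATOR_TYPES:
        errors.append(f"Unknown shard iterator type: {config.shard_iterator_type}")
    # The timestamp is only required for the 'At timestamp' iterator type.
    if config.shard_iterator_type == "At timestamp" and not config.shard_iterator_timestamp:
        errors.append("Shard iterator timestamp is required for 'At timestamp'.")
    return errors
```

For example, a configuration with `shard_iterator_type="Latest"` and only a `token` set passes, while one with `shard_iterator_type="At timestamp"`, an access key but no secret key, and no timestamp reports two problems.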

Inferring a schema

Validio infers the schema of your source connector from your data. For automatic schema inference to work, make sure there are events in the stream when you connect your source.

A stream may be empty from Validio's perspective in two cases:

  • The stream has no events when you connect it to Validio, e.g. events have been deleted after the retention period expired and no new events have been published
  • The shard iterator type points to a position in the stream with no events after it. For example, Validio will read no events if the Latest shard iterator type is chosen and no new events are published

Another option is to prepare a sample of the stream data in a JSON file and upload it during the set-up process for schema inference.
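To check in advance whether a stream would look empty for a given shard iterator type, you can probe it with the same read actions Validio needs. This is a rough sketch, not how Validio itself reads the stream: the function name `stream_has_records` is hypothetical, `client` is assumed to be a boto3 Kinesis client (or any object exposing the same `list_shards`, `get_shard_iterator`, and `get_records` methods), and only the first page of shards and a few `GetRecords` calls per shard are examined, so a `False` result is not conclusive for large or sparse streams.

```python
from typing import Any, Optional


def stream_has_records(
    client: Any,
    stream_name: str,
    iterator_type: str = "TRIM_HORIZON",
    max_calls_per_shard: int = 5,
) -> bool:
    """Return True if any shard yields a record within a few GetRecords calls."""
    # Only the first page of shards is inspected (no NextToken handling).
    shards = client.list_shards(StreamName=stream_name)["Shards"]
    for shard in shards:
        iterator: Optional[str] = client.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard["ShardId"],
            ShardIteratorType=iterator_type,
        )["ShardIterator"]
        for _ in range(max_calls_per_shard):
            if iterator is None:
                break  # Closed shard that has been read to the end.
            response = client.get_records(ShardIterator=iterator, Limit=1)
            if response["Records"]:
                return True
            # An empty batch does not prove the shard is empty; keep following.
            iterator = response.get("NextShardIterator")
    return False
```

With boto3 installed, a client can be created with `boto3.client("kinesis", region_name="eu-north-1")` and passed in together with the stream name, for example `stream_has_records(client, "east-coast-weather")`. Passing `iterator_type="LATEST"` illustrates the second case above: it returns `False` unless new events arrive while the probe is running.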