You will need:
- A service account with permissions to access the specified dataset and table
- A base64 encoded service account key in JSON format
We recommend creating a service account with read-only access for the Validio platform to ingest data from your BigQuery table. Details of these permissions and roles can be found in the GCP IAM documentation.
The service account needs to be assigned the following roles:
- BigQuery Data Viewer (`roles/bigquery.dataViewer`)
- BigQuery Job User (`roles/bigquery.jobUser`)
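If you manage IAM from the command line, the two roles can be granted with `gcloud`. A minimal sketch — the project ID and service account name below are placeholders, substitute your own:

```shell
# Hypothetical project and service account names -- substitute your own.
PROJECT_ID="weather-forecast"
SA_EMAIL="validio-reader@${PROJECT_ID}.iam.gserviceaccount.com"

# Grant the two roles the connector needs. Run as a user with
# IAM admin rights on the project.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/bigquery.dataViewer"

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/bigquery.jobUser"
```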
- Obtain a service account key in JSON format for the service account. See the GCP documentation on creating service account keys for instructions.
- Encode the service account key in base64
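The encoding step is a one-liner. Assuming the downloaded key file is named `key.json` (your filename will differ):

```shell
# Encode the key file to base64 without line wrapping (-w 0 is GNU
# base64; on macOS use `base64 -i key.json` instead).
base64 -w 0 key.json > key.b64
```

Paste the contents of `key.b64` into the connector's Credentials field.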
Description of the fields that can be configured when setting up a BigQuery connector:

|Field|Required|Description|Example|
|---|---|---|---|
|Name|✅|Identifier for the connector. Used when setting up pipelines|East_coast_weather_forecast|
|Credentials|✅|Base64 encoded form of the service account key in JSON format| |
|Project id|✅|Name of the BigQuery project|weather-forecast|
|Dataset id|✅|Name of the dataset that contains the table|east-coast|
|Table name|✅|Name of the table in the dataset to ingest data from. The full table ID, including project ID, dataset ID, and table name, is shown on the table's details page in the GCP console|train-data|
|Incremental column name|✅|The column the Validio platform uses to determine which records have not yet been read. This can be an auto-incrementing integer column or a datetime/timestamp column|updated_timestamp|
|Polling interval value|✅|How often to query the database for new data. Combined with the unit to form the polling interval (e.g. a value of 2 with a unit of "Hour" polls every two hours)|2|
|Unit|✅|The time unit used for the polling interval value|Hour|
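The incremental-column mechanism amounts to remembering the highest value already read and querying only rows above it. A minimal sketch of that logic — the function and parameter names are illustrative, not part of the Validio API:

```python
from datetime import timedelta

def build_incremental_query(project_id, dataset_id, table_name,
                            incremental_column, last_seen):
    """Build the SELECT that fetches only rows newer than the last poll.

    `last_seen` is the highest incremental-column value already read;
    pass None on the first poll to fetch everything.
    """
    table = f"`{project_id}.{dataset_id}.{table_name}`"
    query = f"SELECT * FROM {table}"
    if last_seen is not None:
        # Parameterized comparison against the last value seen.
        query += f" WHERE {incremental_column} > @last_seen"
    return query + f" ORDER BY {incremental_column}"

def polling_interval(value, unit):
    """Combine the polling interval value and unit into a timedelta."""
    units = {"Minute": "minutes", "Hour": "hours", "Day": "days"}
    return timedelta(**{units[unit]: value})
```

For example, `build_incremental_query("weather-forecast", "east-coast", "train-data", "updated_timestamp", None)` builds the first-poll query, and `polling_interval(2, "Hour")` yields a two-hour interval, matching the example values in the table above.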
- RECORD data types and REPEATED columns are encoded as STRING during processing.
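If downstream checks need the nested structure back, the string can be parsed — assuming the STRING encoding is JSON (the value below is a fabricated illustration, not a guaranteed format):

```python
import json

# A REPEATED RECORD column as it might arrive after STRING encoding
# (assuming JSON encoding; verify the actual format in your pipeline).
raw = '[{"city": "Boston", "temp_c": 21.5}, {"city": "NYC", "temp_c": 23.0}]'

records = json.loads(raw)           # back to a list of dicts
cities = [r["city"] for r in records]
```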