Google Cloud Storage (GCS)

Prerequisites for Google Cloud Storage

Validio requires the following credentials and permissions, configured in the Google Cloud Console, to validate your data:

  • A service account with access and permissions to the specified GCS bucket.
  • A JSON file containing the service account key.

For more information, refer to GCP - Introduction to IAM.

Create a Service Account and Assign Roles

We recommend creating a dedicated service account that grants the Validio platform access to the GCS bucket you want to read data from.

The following permissions must be granted to the service account:

  • storage.buckets.get
  • storage.objects.get
  • storage.objects.list
  • Service Usage Consumer role - (Optional) Grant this role to the service account on the billing project if a separate project will be used for quota and billing purposes.
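
Before connecting Validio, it can help to confirm that the service account actually holds the permissions listed above. The following is a minimal sketch, not part of Validio: it assumes the third-party google-cloud-storage Python package is installed, and the helper names, bucket name, and key path are hypothetical.

```python
# Permissions from the list above that the service account needs on the bucket.
REQUIRED_PERMISSIONS = [
    "storage.buckets.get",
    "storage.objects.get",
    "storage.objects.list",
]


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`."""
    granted_set = set(granted)
    return [p for p in REQUIRED_PERMISSIONS if p not in granted_set]


def check_bucket_access(bucket_name, key_path):
    """Ask GCS which required permissions the key's service account holds.

    Hypothetical helper: it needs network access and the
    google-cloud-storage package, so the import is kept local.
    """
    from google.cloud import storage

    client = storage.Client.from_service_account_json(key_path)
    bucket = client.bucket(bucket_name)
    granted = bucket.test_iam_permissions(REQUIRED_PERMISSIONS)
    return missing_permissions(granted)
```

An empty list from check_bucket_access means all three permissions are in place; any names returned still need to be granted. The optional Service Usage Consumer role is not covered by this check.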

Create a Service Account Key

  • Obtain a service account key in JSON file format for your service account.
    For details, refer to Create and delete service account keys.
  • Provide the service account key in the Credentials field in Validio, either by uploading the JSON file or by copying and pasting its contents into the field. For more information, see Add a Google Cloud Credential.
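
A quick local check of the key file can catch copy-and-paste mistakes before the key is provided to Validio. The sketch below is illustrative only: the function name is hypothetical, and the field list is a subset of the fields that service account key files typically contain, not a schema that Validio enforces.

```python
import json

# Fields typically present in a service account key file downloaded from
# Google Cloud. Illustrative subset, not an exhaustive schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_file_problems(text):
    """Return a list of human-readable problems found in the key JSON text."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report fields that are absent or empty.
    problems = [
        f"missing field: {name}" for name in EXPECTED_FIELDS if not key.get(name)
    ]
    # Service account keys carry the type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

An empty list suggests the text is at least well-formed; it does not prove that the key is valid or that the account has access to the bucket.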

Add a Google Cloud Credential

To add a Google Cloud credential:

  1. Navigate to Credentials and click + New Credential.
  2. Under Namespace, select a namespace where the resources will be created.
  3. For Credential Type, select GCP Credential.
  4. Fill in the credential parameter fields. Refer to the Google Cloud Credential Parameters table.
  5. Check Use for catalog to automatically discover credentials and add them to the catalog page.
  6. Click Create.

Validio will validate the connection to the Google Cloud account. If validation passes, Validio will automatically start fetching data. If validation fails, check that you provided the correct parameter values and try again.

Google Cloud Credential Parameters

| Field | Description |
|---|---|
| Name | Identifier for the credentials. Used when accessing Sources. |
| Billing project ID | (Optional) Specify the project to use for quota and billing purposes, if it is different from the resource project. Note: The service account must have the Service Usage Consumer role. |
| Service account JSON-file | Upload the JSON file, or paste the content of the JSON file containing the service account key. For details, refer to Create and delete service account keys. |

Source Configuration Parameters

| Field | Description | Example |
|---|---|---|
| Project id | ID of the Google Cloud project that contains the bucket. | weather-forecast |
| Bucket | Name of the GCS bucket that contains the folder. | east-coast |
| Folder | Name of the folder to read data from. | train-data |
| File pattern | (Optional) Filter which files to read, based on file names and regular expressions. | |
| File format | Select the type of file: CSV, Parquet, or JSON. For CSV files, specify the delimiter used in the file and, optionally, the character or string used to represent a null value. | CSV delimiter: , (comma). Null marker: NULL |
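
To illustrate how the File pattern, CSV delimiter, and null marker settings interact with the files in a folder, here is a minimal sketch using only the Python standard library. It is not Validio's implementation: the function names and sample file names are hypothetical, and it assumes the pattern may match anywhere in the file name, which may differ from Validio's matching rules.

```python
import csv
import io
import re


def select_files(names, pattern=None):
    """Keep the file names that match `pattern`; keep all if none is set.

    Assumption: the pattern is searched anywhere in the name (re.search).
    """
    if pattern is None:
        return list(names)
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


def read_csv(text, delimiter=",", null_marker="NULL"):
    """Parse CSV text into dict rows, mapping the null marker to None."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {col: (None if val == null_marker else val) for col, val in row.items()}
        for row in reader
    ]


# Hypothetical contents of the train-data folder.
names = ["train-data/2024-01.csv", "train-data/2024-02.csv", "train-data/readme.txt"]
csv_files = select_files(names, r"\.csv$")

# With delimiter "," and null marker "NULL", the second field of the
# first data row is read as a missing value.
rows = read_csv("city,temp\nBoston,NULL\nMiami,29\n")
```

Here csv_files contains the two .csv names, and the first row has temp set to None while the second keeps the string "29".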