Amazon Athena

Create a source for Amazon Athena.

Amazon Athena is a query engine for data files stored in Amazon S3.

Prerequisites

In the AWS Management Console, prepare the following credentials and permissions so that Validio can validate your data:

  • An IAM user with access and permissions to query data using Athena.
  • Credentials for the IAM user.

Access key

You must supply an Access key and Secret key to authenticate to AWS. For more information, refer to Managing access keys for IAM users.

Permissions

The following permissions must be assigned to your IAM user. For more information, refer to Identity and access management in Athena.

Athena

  • athena:ListDataCatalogs
  • athena:ListDatabases
  • athena:ListTableMetadata
  • athena:ListQueryExecutions
  • athena:GetDataCatalog
  • athena:StartQueryExecution
  • athena:StopQueryExecution
  • athena:GetQueryExecution
  • athena:GetQueryResults

Glue

  • glue:GetTables
  • glue:GetTable
  • glue:BatchGetPartition
  • glue:GetDatabase
  • glue:GetDatabases
  • glue:GetPartition
  • glue:GetPartitions

S3: Read permissions on both the data source bucket and the query results bucket

  • s3:GetBucketLocation
  • s3:GetObject
  • s3:ListBucket
  • s3:ListBucketMultipartUploads
  • s3:ListMultipartUploadParts

S3: Write permissions on the query results bucket

  • s3:PutObject
  • s3:AbortMultipartUpload
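
The S3 permissions above are scoped differently for the two buckets: read access covers both, write access covers only the query results bucket. The following is a minimal sketch of how that scoping could be expressed as an IAM policy document, built in Python and printed as JSON. The bucket names, the statement IDs, the helper names, and the use of `"Resource": "*"` for the Athena actions are assumptions made for illustration, not values Validio prescribes. The Glue actions listed above would go into one more statement of the same shape; they are left out to keep the sketch short.

```python
import json

# Hypothetical bucket names, used only for illustration.
DATA_BUCKET = "my-data-bucket"
RESULTS_BUCKET = "myathenabucket"

# Action lists copied from the Athena and S3 permission lists above.
ATHENA_ACTIONS = [
    "athena:ListDataCatalogs",
    "athena:ListDatabases",
    "athena:ListTableMetadata",
    "athena:ListQueryExecutions",
    "athena:GetDataCatalog",
    "athena:StartQueryExecution",
    "athena:StopQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
]
S3_READ_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:ListMultipartUploadParts",
]
S3_WRITE_ACTIONS = ["s3:PutObject", "s3:AbortMultipartUpload"]


def bucket_arns(bucket):
    """Return the bucket ARN and the ARN pattern matching every object in it."""
    return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]


def build_policy(data_bucket, results_bucket):
    """Build a policy dict: read on both buckets, write on the results bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AthenaQuery",
                "Effect": "Allow",
                "Action": ATHENA_ACTIONS,
                # Assumption: not narrowed to specific workgroups or catalogs.
                "Resource": "*",
            },
            {
                "Sid": "S3Read",
                "Effect": "Allow",
                "Action": S3_READ_ACTIONS,
                "Resource": bucket_arns(data_bucket) + bucket_arns(results_bucket),
            },
            {
                "Sid": "S3WriteResults",
                "Effect": "Allow",
                "Action": S3_WRITE_ACTIONS,
                "Resource": bucket_arns(results_bucket),
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(DATA_BUCKET, RESULTS_BUCKET), indent=2))
```

Listing both the bucket ARN and the `bucket/*` pattern in each S3 statement keeps the sketch simple, because some of the actions apply to the bucket itself and others to the objects in it. A stricter policy could split them into separate statements.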

Add an Athena Credential

To add a credential for Amazon Athena:

  1. Navigate to Credentials and click + New Credential.

  2. Under Namespace, select a namespace where the resources will be created.

  3. For Credential Type, select AWS Athena Credential.

  4. Fill in the credential parameter fields. Refer to the Athena Credential Parameters table.

  5. Check Use for catalog to automatically discover credentials and add them to the catalog page.

  6. Click Create.
    Validio will validate the connection to the Athena account. If validation passes, Validio will automatically start fetching data. If validation fails, check that you provided the correct parameter values and try again.

    Once the credential is created, you can add a source to monitor Athena data.

Athena Credential Parameters

| Field | Description | Example |
| --- | --- | --- |
| Name | Identifier for the credential. Used when accessing Sources. | service_account_product_staging |
| Access key | Access key for AWS authentication. | AKIAIOSFODNN7EXAMPLE |
| Secret key | Secret key for the specified access key. | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| AWS region | Region from which Athena should run queries. | eu-central-1 |
| Query result location | S3 location in which to store query results. | s3://myathenabucket/results |
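
Before entering these values, it can help to check them locally, in particular that the query result location is a well-formed `s3://` URI, since its bucket is the one that needs the write permissions described under Permissions. The sketch below is one way to do that with the Python standard library. The dictionary keys and function names are hypothetical and chosen for this example; they are not part of any Validio API.

```python
from urllib.parse import urlparse

# Hypothetical field names for this sketch; they mirror the table rows above.
REQUIRED_FIELDS = [
    "name",
    "access_key",
    "secret_key",
    "aws_region",
    "query_result_location",
]


def split_s3_location(location):
    """Split an s3:// URI into (bucket, key_prefix); raise ValueError if malformed."""
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/prefix URI, got {location!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def check_credential_params(params):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing value for {field}"
        for field in REQUIRED_FIELDS
        if not params.get(field)
    ]
    location = params.get("query_result_location")
    if location:
        try:
            split_s3_location(location)
        except ValueError as exc:
            problems.append(str(exc))
    return problems


if __name__ == "__main__":
    example = {
        "name": "service_account_product_staging",
        "access_key": "AKIAIOSFODNN7EXAMPLE",
        "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
        "aws_region": "eu-central-1",
        "query_result_location": "s3://myathenabucket/results",
    }
    print(check_credential_params(example))
    print(split_s3_location(example["query_result_location"]))
```

This only checks the shape of the values. Whether the key pair is valid and has the required permissions is still determined when Validio validates the connection.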

Add an Athena Source

To add a source for Athena:

  1. Navigate to Sources and click + New source.
  2. Under Source type, select Amazon Athena.
  3. Under Config:
    1. Select an existing Credential or create a new credential to authenticate your connection to Athena.
    2. Enter the Catalog, Database, and Table to specify where the data comes from. Selecting more than one table will create a new source for each table. Refer to the Source Configuration Parameters table.
    3. Set how many days of Historic data to use when you start the source.
    4. Set the Polling schedule, which is how frequently the validators on the source will check for changes.
  4. Under Schema, click Continue to automatically infer the schema fields from the tables you selected. If you select many tables, this operation can take a few minutes to complete.
  5. Under Source details:
    1. Add Tags to help group related sources or to use for routing notifications.
    2. Add an Owner who will be the contact for incident notifications.
  6. Click Continue to create the source.
    Source names are generated automatically and are displayed when the source creation completes. If more than five sources are created, you will see the names of the first five and a count of the remaining sources.

Source Configuration Parameters

| Field | Description |
| --- | --- |
| Catalog | Name of the catalog. This is sometimes called Data source. |
| Database | Name of the database. This is sometimes called schema. |
| Table | Name of the table with data to validate. |
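
To illustrate how these three fields identify a table, the sketch below builds the fully qualified name that Athena SQL accepts (`"catalog"."database"."table"`) and a preview query for each selected table, mirroring the one-source-per-table behavior described above. It is an illustration only: how Validio queries Athena internally is not documented here. The function names and the database and table names are hypothetical; `AwsDataCatalog` is the name of Athena's default catalog.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def qualified_table_name(catalog, database, table):
    """Join catalog, database, and table into one quoted, dot-separated name."""
    return ".".join(quote_identifier(part) for part in (catalog, database, table))


def preview_queries(catalog, database, tables, limit=10):
    """Return one preview query per table, keyed by table name."""
    return {
        table: (
            f"SELECT * FROM {qualified_table_name(catalog, database, table)} "
            f"LIMIT {int(limit)}"
        )
        for table in tables
    }


if __name__ == "__main__":
    # Hypothetical database and table names.
    for table, sql in preview_queries(
        "AwsDataCatalog", "sales", ["orders", "customers"]
    ).items():
        print(f"{table}: {sql}")
```

Running such a preview query in the Athena console with the same IAM user is a quick way to confirm that the Catalog, Database, and Table values point at the data you expect before creating the source.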