Amazon Athena
Create a source for Amazon Athena.
Amazon Athena is used as a query engine for Amazon S3 files.
Prerequisites
In the AWS Management Console, prepare the following credentials and permissions for Validio to validate your data:
- An IAM user with access and permissions to query data using Athena.
- Credentials for the IAM user.
Access key
You must supply an Access key and Secret key to authenticate to AWS. For more information, refer to Managing access keys for IAM users.
Permissions
The following permissions must be assigned to your IAM user. For more information, refer to Identity and access management in Athena.
Athena
athena:ListDataCatalogs
athena:ListDatabases
athena:ListTableMetadata
athena:ListQueryExecutions
athena:GetDataCatalog
athena:StartQueryExecution
athena:StopQueryExecution
athena:GetQueryExecution
athena:GetQueryResults
Glue
glue:GetTables
glue:GetTable
glue:BatchGetPartition
glue:GetDatabase
glue:GetDatabases
glue:GetPartition
glue:GetPartitions
S3: Read permissions on both the data source bucket and the query results bucket
s3:GetBucketLocation
s3:GetObject
s3:ListBucket
s3:ListBucketMultipartUploads
s3:ListMultipartUploadParts
S3: Write permissions on the query results bucket
s3:PutObject
s3:AbortMultipartUpload
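The permissions above can be collected into a single IAM policy document. The following is a minimal sketch, not an official Validio policy: the bucket names, the `Sid` labels, and the use of `"Resource": "*"` for the Athena and Glue statements are assumptions made for illustration, and you may want to scope those resources more narrowly. The action names are taken from the lists above (with `glue:GetPartitions` included alongside `glue:GetPartition`).

```python
import json

# Hypothetical bucket names, used only for illustration; substitute your own.
DATA_BUCKET = "my-data-bucket"
RESULTS_BUCKET = "myathenabucket"

ATHENA_ACTIONS = [
    "athena:ListDataCatalogs",
    "athena:ListDatabases",
    "athena:ListTableMetadata",
    "athena:ListQueryExecutions",
    "athena:GetDataCatalog",
    "athena:StartQueryExecution",
    "athena:StopQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
]
GLUE_ACTIONS = [
    "glue:GetTables",
    "glue:GetTable",
    "glue:BatchGetPartition",
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetPartition",
    "glue:GetPartitions",
]
S3_READ_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:ListMultipartUploadParts",
]
S3_WRITE_ACTIONS = ["s3:PutObject", "s3:AbortMultipartUpload"]


def bucket_arns(bucket):
    """Return the ARN of the bucket itself and of every object inside it."""
    return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]


def build_policy(data_bucket, results_bucket):
    """Assemble an IAM policy document as a plain dictionary."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: "*" keeps the sketch short; narrow it if required.
            {"Sid": "AthenaQuery", "Effect": "Allow",
             "Action": ATHENA_ACTIONS, "Resource": "*"},
            {"Sid": "GlueCatalogRead", "Effect": "Allow",
             "Action": GLUE_ACTIONS, "Resource": "*"},
            # Read access covers both the data bucket and the results bucket.
            {"Sid": "S3Read", "Effect": "Allow",
             "Action": S3_READ_ACTIONS,
             "Resource": bucket_arns(data_bucket) + bucket_arns(results_bucket)},
            # Write access is limited to objects in the results bucket.
            {"Sid": "S3WriteResults", "Effect": "Allow",
             "Action": S3_WRITE_ACTIONS,
             "Resource": [f"arn:aws:s3:::{results_bucket}/*"]},
        ],
    }


policy = build_policy(DATA_BUCKET, RESULTS_BUCKET)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the IAM user whose access key you give to Validio.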
Add an Athena Credential
To add a credential for Amazon Athena:
- Navigate to Credentials and click + New Credential.
- Under Namespace, select a namespace where the resources will be created.
- For Credential Type, select AWS Athena Credential.
- Fill in the credential parameter fields. Refer to the Athena Credential Parameters table.
- Check Use for catalog to automatically discover credentials and add them to the catalog page.
- Click Create.
Validio will validate the connection to the Athena account. If validation passes, Validio will automatically start fetching data. If validation fails, check that you provided the correct parameter values and try again.
Once the credential is created, you can add a source to monitor Athena data.
Athena Credential Parameters
Field | Description | Example |
---|---|---|
Name | Identifier for the credentials. Used when accessing Sources. | service_account_product_staging |
Access key | Access key for AWS authentication. | AKIAIOSFODNN7EXAMPLE |
Secret key | Secret key for the specified access key. | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
AWS region | Region from which Athena should run queries. | eu-central-1 |
Query result location | Location where to store query results. | s3://myathenabucket/results |
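Before entering these values, it can help to check that they have a plausible shape. The sketch below is only a local sanity check and does not contact AWS or Validio: the dictionary keys, the `check_credential_params` and `split_s3_location` helpers, and the region pattern are hypothetical names and heuristics introduced here for illustration. The example values are the ones from the table above.

```python
import re
from urllib.parse import urlparse

# Heuristic shape for region names such as eu-central-1 (an assumption,
# not an authoritative list of AWS regions).
REGION_PATTERN = re.compile(r"^[a-z]{2}(?:-[a-z]+)+-\d+$")

REQUIRED_FIELDS = (
    "name", "access_key", "secret_key", "aws_region", "query_result_location",
)


def split_s3_location(location):
    """Split an s3:// URI into (bucket, key prefix)."""
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// location: {location!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def check_credential_params(params):
    """Return a list of problems; an empty list means the shape looks plausible."""
    problems = [
        f"missing value for {field}"
        for field in REQUIRED_FIELDS
        if not params.get(field)
    ]
    region = params.get("aws_region", "")
    if region and not REGION_PATTERN.match(region):
        problems.append(f"unexpected region format: {region!r}")
    location = params.get("query_result_location", "")
    if location:
        try:
            split_s3_location(location)
        except ValueError as exc:
            problems.append(str(exc))
    return problems


# Example values copied from the parameter table.
example = {
    "name": "service_account_product_staging",
    "access_key": "AKIAIOSFODNN7EXAMPLE",
    "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "aws_region": "eu-central-1",
    "query_result_location": "s3://myathenabucket/results",
}

print(check_credential_params(example))
print(split_s3_location(example["query_result_location"]))
```

The bucket name returned by `split_s3_location` is the one that needs the S3 write permissions listed under Permissions.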
Add an Athena Source
To add a source for Athena:
- Navigate to Sources and click + New source.
- Under Source type, select Amazon Athena.
- Under Config:
  - Select the valid Credential or create a new credential to authenticate your connection to the data warehouse.
  - Enter the Catalog, Database, and Table to specify where the data comes from. Selecting more than one table will create a new source for each table. Refer to the Source Configuration Parameters table.
  - Set how many days of Historic data to use when you start the source.
  - Set the Polling schedule, which is how frequently the validators on the source will check for changes.
- Under Schema, click Continue to automatically infer the schema fields from the tables you selected. If you select many tables, this operation can take a few minutes to complete.
- Under Source details:
  - Add Tags to help group related sources or to use for routing notifications.
  - Add an Owner who will be the contact for incident notifications.
- Click Continue to create the source.
Source names are generated automatically and are displayed when source creation completes. If there are more than five sources, you will see the names of the first five and a count of the remaining sources.
Source Configuration Parameters
Field | Description |
---|---|
Catalog | Name of the catalog. This is sometimes called Data source. |
Database | Name of the database. This is sometimes called schema. |
Table | Name of the table with data to validate. |
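The three fields correspond to the three parts of a fully qualified table name in an Athena query. As a rough illustration, the sketch below builds such a name so you can test the same catalog, database, and table in the Athena query editor before creating the source. The `quote_identifier` and `qualified_table` helpers and the `sales_db` and `orders` names are hypothetical; `AwsDataCatalog` is the name of the default catalog in Athena. It is not a description of the queries Validio itself runs.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified_table(catalog, database, table):
    """Join catalog, database, and table into one quoted, dotted name."""
    return ".".join(quote_identifier(part) for part in (catalog, database, table))


# Hypothetical database and table names, for illustration only.
query = f"SELECT * FROM {qualified_table('AwsDataCatalog', 'sales_db', 'orders')} LIMIT 10"
print(query)
# prints: SELECT * FROM "AwsDataCatalog"."sales_db"."orders" LIMIT 10
```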