Common_Problems
While we've made it a priority to make Athena Query Federation user friendly, there can be many integration points between your data source and Athena which need to work together correctly. This page documents some of the most common issues we've experienced while building and supporting this functionality.
Customers often federate across one or more VPCs. When running your connector in a VPC, you must ensure that your Lambda function can communicate with its dependencies. These often include the items below, depending on the connector you are using:
- Amazon S3 - for spilling large responses.
- AWS Glue DataCatalog - for metadata (if your connector is Glue enabled).
- Amazon Athena - for checking query status and preventing overscan.
- AWS Secrets Manager - for resolving any secrets (e.g. password) that you need for your connector.
- AWS KMS - for generating data keys to encrypt large responses that spill to S3 (you can optionally use a local key source instead).
- Ability to contact your source system (e.g. MySQL, HBase master, etc.) - for federating to the source system.
Most of the time you can resolve connectivity problems with the dependencies above by adding VPC endpoints for the required services or by running your connector in a VPC that has an internet gateway. The right fix depends on your network topology. The safest option is usually a VPC endpoint, since it doesn't require internet access.
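To find out which dependency is unreachable, it can help to run a small TCP probe from inside the same VPC and subnets as the connector, for example from a throwaway Lambda function. The sketch below is illustrative only and is not part of the Athena Query Federation SDK. The function names are hypothetical, and the hostnames assume the standard `<service>.<region>.amazonaws.com` endpoint pattern, which may not match your partition or any custom endpoints you use. A successful TCP connection also says nothing about IAM permissions.

```python
import socket

# Service prefixes for the AWS dependencies listed above (assumed set; which
# ones you actually need depends on your connector).
AWS_SERVICES = ["s3", "glue", "athena", "secretsmanager", "kms"]


def regional_endpoints(region, services=AWS_SERVICES):
    """Build regional endpoint hostnames, assuming the standard naming pattern."""
    return [f"{service}.{region}.amazonaws.com" for service in services]


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_dependencies(targets, timeout=3.0):
    """Map each (host, port) target to whether it is reachable."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in targets}
```

For example, `check_dependencies([(h, 443) for h in regional_endpoints("us-east-1")] + [("my-db.internal", 3306)])` would probe the AWS endpoints plus a source database, where `my-db.internal` is a placeholder for your own host. A target that reports `False` usually points at a missing VPC endpoint, route, or security group rule.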
If you plan to query more than ~6 MB of data at a time, your connector will need an S3 bucket and prefix to spill response data to. This data is encrypted by default and can optionally use KMS for data keys. One of the more common problems we encounter is a mistyped S3 bucket name, or a spill location that the user running the Athena queries cannot access. Each query uses its own encryption key, and the data can be deleted after the query completes.
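A quick way to catch a mistyped bucket or a missing permission is to write, read, and delete a small probe object under the spill prefix, using the same credentials that will run the query. The following is a minimal sketch, not something the connectors do themselves. `check_spill_location` is a hypothetical helper, and `s3_client` is assumed to behave like `boto3.client("s3")`.

```python
import uuid


def check_spill_location(s3_client, bucket, prefix):
    """Write, read back, and delete a small probe object under the spill prefix.

    Returns (ok, detail). The check only reflects the permissions of the
    credentials behind s3_client.
    """
    clean_prefix = prefix.strip("/")
    name = f"spill-probe-{uuid.uuid4().hex}"
    key = f"{clean_prefix}/{name}" if clean_prefix else name
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=b"probe")
        s3_client.get_object(Bucket=bucket, Key=key)
        s3_client.delete_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # With boto3 this is typically botocore.exceptions.ClientError,
        # e.g. NoSuchBucket for a typo or AccessDenied for missing permissions.
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"wrote, read, and deleted s3://{bucket}/{key}"
```

Calling `check_spill_location(boto3.client("s3"), "my-spill-bucket", "athena-spill")` once as the identity that runs Athena queries, and once as the connector's Lambda role, helps narrow down which side lacks access. The bucket and prefix names here are placeholders.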