0 parents
commit 7d40b81
Showing 51 changed files with 4,811 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
PROFILE=default | ||
VERSION=latest | ||
REGION= | ||
CATALOG=AwsDataCatalog | ||
WORKGROUP=primary | ||
QUERY_OUTPUT= | ||
AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL1QUERIES=5 | ||
AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL1QUERIES=10 | ||
AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL2QUERIES=5 | ||
AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL2QUERIES=20 | ||
AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL3QUERIES=20 | ||
AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL3QUERIES=40 | ||
AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL4QUERIES=20 | ||
AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL4QUERIES=80 | ||
AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL5QUERIES=100 | ||
AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL5QUERIES=200 | ||
AWS_DEFAULT_SIMULTANEOUS_DDL_QUERIES=20 | ||
AWS_DEFAULT_SIMULTANEOUS_DML_QUERIES=20 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2021 Francois Chaumont | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
# Toolkit for AWS Athena API | ||
|
||
[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/FrancoisChaumont/aws-athena-api-tools/issues) | ||
![GitHub release](https://img.shields.io/github/release/FrancoisChaumont/aws-athena-api-tools.svg) | ||
[![GitHub issues](https://img.shields.io/github/issues/FrancoisChaumont/aws-athena-api-tools.svg)](https://github.com/FrancoisChaumont/aws-athena-api-tools/issues) | ||
[![GitHub stars](https://img.shields.io/github/stars/FrancoisChaumont/aws-athena-api-tools.svg)](https://github.com/FrancoisChaumont/aws-athena-api-tools/stargazers) | ||
![Github All Releases](https://img.shields.io/github/downloads/FrancoisChaumont/aws-athena-api-tools/total.svg) | ||
|
||
## Introduction | ||
**What does it do?** It lets you do the following from the command line: | ||
- create/drop database | ||
- execute a single query | ||
- execute multiple queries simultaneously while remaining within your max rate limits | ||
- create partitions on non-hive or hive formatted data | ||
- get the current state of one or more queries | ||
- stop a running query | ||
- delete metadata files | ||
- create a named query | ||
- list & detail named queries | ||
- list & detail databases | ||
- list & detail database tables | ||
|
||
## Requirements | ||
- [PHP](https://www.php.net/releases/7_4_0.php) ^7.4 | ||
- [aws/aws-sdk-php](https://github.com/aws/aws-sdk-php) ^3.175 | ||
- [vlucas/phpdotenv](https://github.com/vlucas/phpdotenv) ^5.3 | ||
- [Composer](https://getcomposer.org) | ||
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html) | ||
- AWS_ACCESS_KEY_ID & AWS_SECRET_ACCESS_KEY¹ | ||
|
||
> ¹ The SDK should detect the credentials from environment variables (via AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), an AWS credentials INI file in your HOME directory, AWS Identity and Access Management (IAM) instance profile credentials, or credential providers | ||
## Installation | ||
Download a copy of this repository and run the following: | ||
``` | ||
composer install | ||
``` | ||
|
||
## Configuration | ||
Edit the following variables in the [.env](.env) file to set the default values used when the related options are omitted: | ||
- `PROFILE`: AWS profile from ~/.aws/credentials | ||
- `VERSION`: AWS web service (API) version | ||
- `REGION`: AWS region to connect to | ||
- `CATALOG`: Athena data source catalog | ||
- `WORKGROUP`: Athena workgroup | ||
- `QUERY_OUTPUT`: S3 bucket for query results | ||
- `AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL1QUERIES`¹: level 1 queries max calls per second | ||
- `AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL1QUERIES`¹⁺⁰: level 1 queries max burst capacity | ||
- `AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL2QUERIES`²: level 2 queries max calls per second | ||
- `AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL2QUERIES`²⁺⁰: level 2 queries max burst capacity | ||
- `AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL3QUERIES`³: level 3 queries max calls per second | ||
- `AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL3QUERIES`³⁺⁰: level 3 queries max burst capacity | ||
- `AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL4QUERIES`⁴: level 4 queries max calls per second | ||
- `AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL4QUERIES`⁴⁺⁰: level 4 queries max burst capacity | ||
- `AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL5QUERIES`⁵: level 5 queries max calls per second | ||
- `AWS_DEFAULT_MAX_BURST_CAPACITY_LEVEL5QUERIES`⁵⁺⁰: level 5 queries max burst capacity | ||
- `AWS_DEFAULT_SIMULTANEOUS_DDL_QUERIES`⁶: max simultaneous DDL queries | ||
- `AWS_DEFAULT_SIMULTANEOUS_DML_QUERIES`⁷: max simultaneous DML queries | ||
|
||
¹BatchGetNamedQuery, ListNamedQueries, ListQueryExecutions | ||
²CreateNamedQuery, DeleteNamedQuery, GetNamedQuery | ||
³BatchGetQueryExecution | ||
⁴StartQueryExecution, StopQueryExecution | ||
⁵GetQueryExecution, GetQueryResults - `a value higher than 2 will exceed the max rate limit` | ||
⁶create table, create table add partition | ||
⁷select, create table as (CTAS) | ||
|
||
⁰max burst capacity not yet implemented | ||
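As a rough illustration of what a calls-per-second limit implies, the sketch below spaces calls out so that no more than the configured number start per second. It is not the toolkit's implementation: the class name `CallPacer` and the stand-in loop are hypothetical, and, in line with the note above, burst capacity is ignored.

```php
<?php
// Hypothetical pacing helper: enforces a minimum delay between consecutive calls.
final class CallPacer
{
    private float $interval;           // minimum number of seconds between two calls
    private float $nextAllowed = 0.0;  // earliest time the next call may start

    public function __construct(int $maxCallsPerSecond)
    {
        if ($maxCallsPerSecond < 1) {
            throw new InvalidArgumentException('maxCallsPerSecond must be >= 1');
        }
        $this->interval = 1.0 / $maxCallsPerSecond;
    }

    // Blocks until the next call is allowed, then reserves the following slot.
    public function wait(): void
    {
        $now = microtime(true);
        if ($now < $this->nextAllowed) {
            usleep((int) ceil(($this->nextAllowed - $now) * 1000000));
            $now = microtime(true);
        }
        $this->nextAllowed = max($now, $this->nextAllowed) + $this->interval;
    }
}

// Read the limit from the environment, falling back to the .env default of 20.
$limit = (int) (getenv('AWS_DEFAULT_MAX_CALLS_PER_SECOND_LEVEL4QUERIES') ?: 20);
$pacer = new CallPacer($limit);

for ($i = 1; $i <= 5; $i++) {
    $pacer->wait();
    // A real tool would issue its API call here (assumption); this only prints.
    echo "call {$i}", PHP_EOL;
}
```

Fixed spacing is stricter than a burst-aware limiter, so it is a conservative way to stay under a per-second limit.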
|
||
## Important | ||
- In query files, double every `%` that is not part of a parameter placeholder (write `%%`); otherwise `sprintf` will treat it as the start of a format specifier | ||
|
||
Example passing the year and month to build the table name: | ||
```sql | ||
SELECT DATE_FORMAT(FROM_UNIXTIME(1614716423), '%%Y-%%m-%%d %%H:%%i:%%S') | ||
FROM database.table_name_%1$s%2$s | ||
LIMIT 1 | ||
``` | ||
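To make the effect of the doubled `%` concrete, here is a minimal PHP sketch that runs the query above through `sprintf`. The values `'2021'` and `'03'` are illustrative assumptions, and how the toolkit itself loads the query file is not shown here.

```php
<?php
// Query template as it would appear in a query file (nowdoc, so nothing is interpolated).
$template = <<<'SQL'
SELECT DATE_FORMAT(FROM_UNIXTIME(1614716423), '%%Y-%%m-%%d %%H:%%i:%%S')
FROM database.table_name_%1$s%2$s
LIMIT 1
SQL;

// %1$s and %2$s are positional placeholders; each %% collapses to a literal %.
$query = sprintf($template, '2021', '03');

echo $query, PHP_EOL;
```

With these arguments the table name becomes `database.table_name_202103`, and the date format string reaches Athena as `'%Y-%m-%d %H:%i:%S'`.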
|
||
## The tools | ||
See tools [documentation](READMEs/README.tools.md) for more details. | ||
|
||
## Testing | ||
See tests [documentation](READMEs/README.tests.md) for more details. | ||
|
||
## AWS documentation | ||
AWS documentation: | ||
- AWS SDK [Basic Usage](https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/getting-started_basic-usage.html) | ||
- AWS SDK [API documentation for Athena](https://docs.aws.amazon.com/aws-sdk-php/v3/api/namespace-Aws.Athena.html) | ||
- AWS SDK for PHP v3 [Getting Started](https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/getting-started_index.html) | ||
- AWS Athena [Service Limits](https://docs.aws.amazon.com/athena/latest/ug/service-limits.html) | ||
- List of [AWS regions](http://docs.aws.amazon.com/general/latest/gr/rande.html) | ||
- Data [Partitioning](https://docs.aws.amazon.com/athena/latest/ug/partitions.html) | ||
|
||
## TODO | ||
Methods: | ||
- BatchGetNamedQuery | ||
- BatchGetQueryExecution | ||
- CreateDataCatalog | ||
- CreatePreparedStatement | ||
- CreateWorkGroup | ||
- DeleteDataCatalog | ||
- DeletePreparedStatement | ||
- DeleteWorkGroup | ||
- GetDataCatalog | ||
- GetPreparedStatement | ||
- GetQueryResults | ||
- GetWorkGroup | ||
- ListDataCatalogs | ||
- ListEngineVersions | ||
- ListPreparedStatements | ||
- ListQueryExecutions | ||
- ListTagsForResource | ||
- ListWorkGroups | ||
- TagResource | ||
- UntagResource | ||
- UpdateDataCatalog | ||
- UpdatePreparedStatement | ||
- UpdateWorkGroup | ||
|
||
Others: | ||
- implement burst capacity? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
## Tests | ||
This [test](../tests/test.sh) script exercises every tool in this library. | ||
|
||
Make sure to read **Requirements**, **Installation** and **Configuration** first. | ||
**For safety, the script asks for confirmation at startup before deleting data on S3 and dropping the database/tables.** | ||
|
||
Tested on Ubuntu 20.04 running PHP 7.4. | ||
|
||
It requires the database `sampledb` and performs the following: | ||
1. list database `sampledb` | ||
2. create a new database | ||
3. create test data by extracting from the `sampledb.elb_logs` table and creating tables with daily data | ||
4. create a table for test data spanning multiple days | ||
5. create day partitions on the test data tables | ||
6. select data from the multiple-day table | ||
7. select several days of data from the single-day tables | ||
8. display query result files on S3 | ||
9. delete metadata files | ||
10. display query result files on S3 without metadata files | ||
11. detail tables in the database | ||
12. drop database and all tables | ||
13. delete data from S3 | ||
14. create named query | ||
15. detail named query | ||
16. delete named query | ||
|
||
Usage: | ||
```shell | ||
/bin/bash test.sh \ | ||
-d DATABASE_TO_CREATE \ | ||
-y YEAR_OF_DATA_TO_EXTRACT \ | ||
-m MONTH_OF_DATA_TO_EXTRACT | ||
``` | ||
|
||
Example: | ||
```shell | ||
/bin/bash test.sh \ | ||
-d aws_athena_api_tools_tests \ | ||
-y 2015 \ | ||
-m 01 | ||
``` | ||
|
||
Output: | ||
See expected [output](../tests/output.txt) for more details. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
## The tools | ||
**Create/Drop a database** | ||
See [usage](../tools/usage/database.usage.php) or | ||
``` | ||
php database.php -h/--help | ||
``` | ||
|
||
**Execute a single query** | ||
select | create table [as] | create view | create database | drop table ... | ||
|
||
See [usage](../tools/usage/query.usage.php) or | ||
``` | ||
php query.php -h/--help | ||
``` | ||
|
||
**Execute queries for each day in the given date range within the max rate limit** | ||
select | create table [as] | create view | create database | drop table ... | ||
|
||
Examples: [query-daily.sql](../examples/query-daily.sql) | ||
|
||
See [usage](../tools/usage/query-daily.usage.php) or | ||
``` | ||
php query-daily.php -h/--help | ||
``` | ||
|
||
**Execute queries for each month in the given date range within the max rate limit** | ||
select | create table [as] | create view | create database | drop table ... | ||
|
||
Examples: | ||
[create-table.sql](../examples/create-table.sql), | ||
[create-table-partitioned.sql](../examples/create-table-partitioned.sql), | ||
[create-table-partitioned-hive.sql](../examples/create-table-partitioned-hive.sql), | ||
[drop-table.sql](../examples/drop-table.sql), | ||
[query-monthly.sql](../examples/query-monthly.sql) | ||
|
||
See [usage](../tools/usage/query-monthly.usage.php) or | ||
``` | ||
php query-monthly.php -h/--help | ||
``` | ||
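The idea of running one query per month can be sketched as follows: walk the date range month by month and fill the year and month into a query template with `sprintf`. This is only an assumption about the general approach, not the code of `query-monthly.php`; the function name `monthlyQueries` and the `YYYY-MM` argument format are hypothetical.

```php
<?php
// Hypothetical helper: builds one query per month between $fromMonth and $toMonth (inclusive).
// Both bounds use the YYYY-MM format; the template receives year as %1$s and month as %2$s.
function monthlyQueries(string $template, string $fromMonth, string $toMonth): array
{
    // Anchor on the first day of the month so adding one month never overflows.
    $start = new DateTimeImmutable($fromMonth . '-01');
    // DatePeriod excludes its end date, so move the end one month further.
    $end = (new DateTimeImmutable($toMonth . '-01'))->modify('+1 month');

    $queries = [];
    foreach (new DatePeriod($start, new DateInterval('P1M'), $end) as $month) {
        $queries[] = sprintf($template, $month->format('Y'), $month->format('m'));
    }
    return $queries;
}

$queries = monthlyQueries(
    'DROP TABLE IF EXISTS database.table_name_%1$s%2$s',
    '2020-11',
    '2021-02'
);
print_r($queries);
```

Each generated query would still need to be submitted within the rate limits described in the configuration.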
|
||
**Create day partitions on a table (non-Hive formatted data)** | ||
See [usage](../tools/usage/partitions-daily.usage.php) or | ||
``` | ||
php partitions-daily.php -h/--help | ||
``` | ||
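For data that is not laid out in Hive style (`key=value` folders), each partition has to be registered explicitly with its S3 location. The sketch below generates one `ALTER TABLE ... ADD PARTITION` statement per day. It is an illustration under stated assumptions, not the code of `partitions-daily.php`: the function name, the partition column `day` and the `YYYY/MM/DD` folder layout are all hypothetical.

```php
<?php
// Hypothetical helper: one ADD PARTITION statement per day between $from and $to (inclusive).
// Assumes a partition column named "day" and data stored under <prefix>/YYYY/MM/DD/.
function dailyPartitionStatements(string $table, string $s3Prefix, string $from, string $to): array
{
    $start = new DateTimeImmutable($from);
    // DatePeriod excludes its end date, so move the end one day further.
    $end = (new DateTimeImmutable($to))->modify('+1 day');

    $statements = [];
    foreach (new DatePeriod($start, new DateInterval('P1D'), $end) as $day) {
        $statements[] = sprintf(
            "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (day = '%s') LOCATION '%s/%s/'",
            $table,
            $day->format('Y-m-d'),
            rtrim($s3Prefix, '/'),
            $day->format('Y/m/d')
        );
    }
    return $statements;
}

$statements = dailyPartitionStatements('db.logs', 's3://bucket/logs/', '2015-01-30', '2015-02-01');
print_r($statements);
```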
|
||
**Create day partitions on a table (Hive formatted data)** | ||
See [usage](../tools/usage/partitions-daily-hive.usage.php) or | ||
``` | ||
php partitions-daily-hive.php -h/--help | ||
``` | ||
|
||
**Get the execution state of a query (running, failed, succeeded, ...)** | ||
See [usage](../tools/usage/state.usage.php) or | ||
``` | ||
php state.php -h/--help | ||
``` | ||
|
||
**Get the execution state of queries listed in a file (running, failed, succeeded, ...)** | ||
Example: [query-ids-list.txt](../examples/query-ids-list.txt) | ||
|
||
See [usage](../tools/usage/state-from-list.usage.php) or | ||
``` | ||
php state-from-list.php -h/--help | ||
``` | ||
|
||
**Stop a running query** | ||
See [usage](../tools/usage/stop.usage.php) or | ||
``` | ||
php stop.php -h/--help | ||
``` | ||
|
||
**Delete metadata files recursively from an S3 location (bucket/prefixes)** | ||
See [usage](../tools/usage/delete-metadata-files.usage.sh) or | ||
``` | ||
delete-metadata-files.sh -h/--help | ||
``` | ||
|
||
**Create or delete a named query** | ||
See [usage](../tools/usage/named-query.usage.php) or | ||
``` | ||
php named-query.php -h/--help | ||
``` | ||
|
||
**Detail one or all named queries and output as JSON** | ||
See [usage](../tools/usage/list-named-queries.usage.php) or | ||
``` | ||
php list-named-queries.php -h/--help | ||
``` | ||
|
||
**Detail one or all databases and output as JSON** | ||
See [usage](../tools/usage/list-databases.usage.php) or | ||
``` | ||
php list-databases.php -h/--help | ||
``` | ||
|
||
**Detail one or all tables of a database and output as JSON** | ||
See [usage](../tools/usage/list-tables.usage.php) or | ||
``` | ||
php list-tables.php -h/--help | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
{ | ||
"name": "francoischaumont/aws-athena-api-tools", | ||
"description": "Toolkit for AWS Athena using AWS SDK for PHP v3", | ||
"authors": [ | ||
{ | ||
"name": "Francois Chaumont", | ||
"role": "main developer" | ||
} | ||
], | ||
"require": { | ||
"php": "^7.4", | ||
"aws/aws-sdk-php": "^3.175", | ||
"vlucas/phpdotenv": "^5.3" | ||
}, | ||
"autoload": { | ||
"psr-4": { | ||
"FC\\AWS\\": "src/AWS/" | ||
} | ||
} | ||
} |