What happened:
Upon providing valid arguments, the following error appeared:
[2021-06-12 16:31:46,277] {base_aws.py:395} INFO - Creating session using boto3 credential strategy region_name=None
[2021-06-12 16:31:47,339] {taskinstance.py:1481} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1137, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1311, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1341, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/operators/glue.py", line 106, in execute
s3_hook.load_file(self.script_location, self.s3_bucket, self.s3_artifacts_prefix + script_name)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 62, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 91, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 499, in load_file
if not replace and self.check_for_key(key, bucket_name):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 62, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 91, in wrapper
return func(*bound_args.args, **bound_args.kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 323, in check_for_key
self.get_conn().head_object(Bucket=bucket_name, Key=key)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 648, in _make_api_call
request_dict = self._convert_to_request_dict(
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 694, in _convert_to_request_dict
api_params = self._emit_api_params(
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 723, in _emit_api_params
self.meta.events.emit(
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/hooks.py", line 356, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/hooks.py", line 228, in emit
return self._emit(event_name, kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/hooks.py", line 211, in _emit
response = handler(**kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/handlers.py", line 236, in validate_bucket_name
raise ParamValidationError(report=error_msg)
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid bucket name "artifacts/glue-scripts/example.py": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
[2021-06-12 16:31:47,341] {taskinstance.py:1524} INFO - Marking task as UP_FOR_RETRY. dag_id=glue-example, task_id=example_glue_job_operator, execution_date=20210612T163143, start_date=20210612T163145, end_date=20210612T163147
[2021-06-12 16:31:47,386] {local_task_job.py:151} INFO - Task exited with return code 1
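The ParamValidationError above comes from botocore's bucket-name check: the S3 key path that ended up in the Bucket parameter contains "/", which the plain bucket-name regex rejects. A minimal sketch of that check (using only the regex quoted in the error message; the ARN alternatives are omitted):

```python
import re

# Plain bucket-name regex quoted in the botocore error message above (ARN forms omitted).
BUCKET_RE = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")

def looks_like_valid_bucket(name: str) -> bool:
    """Return True if `name` passes the plain bucket-name regex."""
    return BUCKET_RE.match(name) is not None

# The key path that was passed as the bucket name fails ('/' is not allowed),
# while the actual bucket name passes.
print(looks_like_valid_bucket("artifacts/glue-scripts/example.py"))  # False
print(looks_like_valid_bucket("bucket-name"))  # True
```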
What you expected to happen: The script upload to succeed, and to be able to replace an existing script in S3.
How to reproduce it:
Try to upload the file to any S3 bucket with a local script_location:
t2 = AwsGlueJobOperator(
    task_id="example_glue_job_operator",
    job_desc="Example Airflow Glue job",
    # Note the operator will upload the script if it is not an s3:// reference
    # See https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/operators/glue.py#L101
    script_location="/opt/airflow/dags_lib/example.py",
    concurrent_run_limit=1,
    script_args={},
    num_of_dpus=1,  # This parameter is deprecated in boto3. Use MaxCapacity in create_job_kwargs instead.
    aws_conn_id="aws_default",
    region_name="aws-region",
    s3_bucket="bucket-name",
    iam_role_name="iam_role_name_here",
    create_job_kwargs={},
)
Anything else we need to know:
How often does this problem occur? Every time a local script is used.
Apache Airflow version: 2.1.0
Kubernetes version (if you are using kubernetes) (use kubectl version): NA
Environment: bare metal k8s in AWS EC2
uname -a: Linux airflow-web-749866f579-ns9rk 5.4.0-1048-aws #50-Ubuntu SMP Mon May 3 21:44:17 UTC 2021 x86_64 GNU/Linux
Looking at the order of arguments, it seems the 2nd and 3rd are reversed. Furthermore, the operator does not expose the replace option, which would be very valuable.
Note that the key and bucket name are passed by position, not by keyword: https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/operators/glue.py#L104
and they are reversed: https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/hooks/s3.py#L466
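To illustrate the mix-up with a simplified stand-in (this mimics only the parameter order of S3Hook.load_file, it is not the real Airflow code): since the signature is (filename, key, bucket_name, ...), the operator's positional call puts the bucket into key and the prefixed key into bucket_name, which is exactly why the traceback shows the key path as an invalid bucket name. Passing by keyword (and exposing replace) would avoid it:

```python
# Simplified stand-in mirroring S3Hook.load_file's parameter order (not the real code).
def load_file(filename, key, bucket_name=None, replace=False):
    return {"filename": filename, "key": key, "bucket_name": bucket_name, "replace": replace}

script_location = "/opt/airflow/dags_lib/example.py"
s3_bucket = "bucket-name"
s3_key = "artifacts/glue-scripts/example.py"

# Buggy positional call, as in glue.py: bucket and key land in the wrong parameters.
buggy = load_file(script_location, s3_bucket, s3_key)
assert buggy["bucket_name"] == s3_key  # the key path ends up as the bucket name

# Fixed call: keyword arguments, plus replace=True so re-runs can overwrite the script.
fixed = load_file(script_location, key=s3_key, bucket_name=s3_bucket, replace=True)
assert fixed["bucket_name"] == s3_bucket and fixed["key"] == s3_key
```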
I can take a stab at fixing it. I did notice the operator does not allow updating a Glue job definition after its creation; boto3 offers an API to do so, but it is not exposed in this operator: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client.update_job It would be great if I could add that as well, but it might fall out of scope.
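A rough sketch of what exposing that update path could look like. This is only an illustration under assumptions: build_job_update is a hypothetical helper, and the payload field names follow the boto3 Glue update_job API (JobName plus a JobUpdate dict); the actual network call, shown only in the docstring, would be glue_client.update_job(**kwargs).

```python
def build_job_update(job_name, role, script_s3_path, max_capacity=None, default_args=None):
    """Build kwargs for boto3's glue client update_job call.

    Hypothetical helper for illustration; the real call would be
    glue_client.update_job(**build_job_update(...)).
    """
    job_update = {
        "Role": role,
        "Command": {"Name": "glueetl", "ScriptLocation": script_s3_path},
        "DefaultArguments": default_args or {},
    }
    if max_capacity is not None:
        # MaxCapacity supersedes the deprecated AllocatedCapacity / num_of_dpus.
        job_update["MaxCapacity"] = max_capacity
    return {"JobName": job_name, "JobUpdate": job_update}

kwargs = build_job_update(
    "glue-example",
    "iam_role_name_here",
    "s3://bucket-name/artifacts/glue-scripts/example.py",
    max_capacity=1.0,
)
```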