You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We are using tags on resource in AWS. When setting tags when using GlueCrawlerOperator it works the first time, when Airflow creates the crawler. However on subsequent runs in fails because boto3.get_crawler() does not return the Tags. Hence we get the error below.
[2022-11-08, 14:48:49 ] {taskinstance.py:1774} ERROR - Task failed with exception
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/glue_crawler.py", line 80, in execute
self.hook.update_crawler(**self.config)
File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/glue_crawler.py", line 86, in update_crawler
key: value for key, value in crawler_kwargs.items() if current_crawler[key] != crawler_kwargs[key]
File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/glue_crawler.py", line 86, in <dictcomp>
key: value for key, value in crawler_kwargs.items() if current_crawler[key] != crawler_kwargs[key]
KeyError: 'Tags'
What you think should happen instead
Ignore tags when checking if the crawler should be updated.
How to reproduce
Use GlueCrawlerOperator with Tags like below and trigger the task multiple times. It will fail the second time around.
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
We are using tags on resource in AWS. When setting tags when using
GlueCrawlerOperator
it works the first time, when Airflow creates the crawler. However on subsequent runs in fails becauseboto3.get_crawler()
does not return the Tags. Hence we get the error below.What you think should happen instead
Ignore tags when checking if the crawler should be updated.
How to reproduce
Use
GlueCrawlerOperator
with Tags like below and trigger the task multiple times. It will fail the second time around.Operating System
Ubuntu
Versions of Apache Airflow Providers
Deployment
Other 3rd-party Helm chart
Deployment details
Airflow v2.2.5
Self-hosted Airflow in Kubernetes.
Anything else
No response
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: