Fix GlueCrawlerOperator failure when using tags #28005
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
Sorry for the force-push spam, I was sorting out the GPG signature thingy.
@IAL32 It would also be nice to have some tests for this scenario.
@Taragolis I have added several tests, and isolated tag updating by adding a In my solution, I call
Changing tags, however, requires knowing the crawler's ARN, which can only be obtained by concatenating the current account ID, the region name and the crawler's name.
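As a rough illustration of the concatenation described above, here is a minimal sketch in plain Python. The helper name `build_crawler_arn` and the default `aws` partition are assumptions made for this example, not code from the PR; other partitions (for example `aws-cn`) would need a different value.

```python
def build_crawler_arn(
    account_id: str,
    region_name: str,
    crawler_name: str,
    partition: str = "aws",  # assumption: standard commercial partition
) -> str:
    """Assemble a Glue crawler ARN from its parts (hypothetical helper)."""
    return f"arn:{partition}:glue:{region_name}:{account_id}:crawler/{crawler_name}"


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(build_crawler_arn("123456789012", "eu-west-1", "my-crawler"))
```

The account ID is the only piece that is not already known from the connection and the crawler configuration, which is why the discussion turns to STS.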
I like your idea, just some suggestions and nitpicks.
    @cached_property
    def sts_hook(self):
        return StsHook(aws_conn_id=self.aws_conn_id)
IMHO, it is better not to add one hook as a property of another if possible, or at least to make it "private".
It looks like we only need to call it once in GlueCrawlerHook, so we could call it directly in the update_tags method:
    account_id = StsHook(aws_conn_id=self.aws_conn_id).get_account_number()
In the long term (separate PR), it would be a good idea to create an account_id property in AwsBaseHook.
@Taragolis I updated the code following your suggestions 😄 thanks!
Fixes needed :(
@potiuk resolved conflicts and test
Awesome work, congrats on your first merged pull request!
Closes #27556.
The previous implementation assumes that the current crawler data always contains every key present in the wanted configuration. This is not true, and a KeyError is raised whenever a key is missing from the current crawler configuration.
This PR fixes this by using the more stable dict.get method, providing a default value of None to allow proper comparison.
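The difference can be sketched with plain dictionaries. The function name `changed_keys` and the sample configuration values are hypothetical and stand in for the hook's real comparison logic, which is not reproduced here.

```python
def changed_keys(current: dict, wanted: dict) -> list:
    """Return the keys whose wanted value differs from the current one.

    dict.get returns None for a missing key instead of raising KeyError,
    so a key absent from `current` is simply reported as changed.
    """
    return [key for key, value in wanted.items() if current.get(key, None) != value]


if __name__ == "__main__":
    # Hypothetical crawler data: "Description" is absent from the current state.
    current = {"Name": "my-crawler", "Role": "role-a"}
    wanted = {"Name": "my-crawler", "Role": "role-a", "Description": "nightly"}

    try:
        # Direct indexing, as in the failing behaviour described above.
        [key for key in wanted if current[key] != wanted[key]]
    except KeyError as err:
        print(f"KeyError on missing key: {err}")

    print(changed_keys(current, wanted))
```

One caveat of this approach: a wanted value that is explicitly None compares equal to a missing key, so such a key would not be reported as changed.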