Update SDKs for google provider package #30067
Got referred here from #27292. I eventually get stuck with apache-airflow-providers-google depending on google-cloud-secret-manager < 2.x, which depends on protobuf 3, and that causes my predicament. Higher versions depend on protobuf > 4.5.x. Is there progress on this ticket? Just wondering, but keep up the good work.
@felicienveldema We are still working on the changes in the google-provider package, but we still have problems with the dependencies in 3 packages because they still depend on protobuf<4.0; we will probably have updates in May. So I think we will have a new google-provider version supporting the latest versions of the SDKs at the end of May or later.
I got curious about the 3 listed packages above causing issues. One of them |
I have not looked at it yet. Do you have some ideas ? |
Generally the options are:
So we have a number of options we can follow. |
I just opened the issue to: yandex-cloud: yandex-cloud/python-sdk#71 and I will prepare support for disabling providers and excluding them if they are holding us back (cc: @eladkal). I will also raise this to our devlist. |
Devlist discussion started: https://lists.apache.org/thread/j98bgw9jo7xr4fvjh27d6bfoyxr1omcm (CC: @eladkal especially). I am curious what you think.
FYI: We have no problem with apache-beam: apache/beam#24599 - 2 weeks ago they merged the protobuf bump, so we just need to wait for the next release.
I also asked Oracle/MySQL in their forums (the only way we can do it) https://forums.mysql.com/read.php?50,708413 and see what they say. But I am also all for disabling mysql provider if they don't respond. |
We have first reaction: yandex-cloud/python-sdk#71 (comment) |
I'm also 100% for this (but I've got some bias in that I always use Postgres over mysql). I hesitated to suggest that since I was unsure if that would impact offering mysql as one of Airflow Meta DB backends. |
I have to check, but I think this actually has nothing to do with the mysql metadata backend. For that we are using sqlalchemy and it has a few drivers it can choose from. And I think our driver for CI/tests is BTW. Another possibility is to rewrite the hooks to use mysqlclient. I might take a look at that actually.
Hi, is it too risky for Airflow to just update from Today Latest compatibility to a new API version was added in: https://github.com/googleads/google-ads-python/pull/672/files#diff-91c5b46dc84a94604a4e4d0caed9bf85590a2eddbb12d2e8dc80badf324a9dfbR9 (
|
We're in the dark night now. Sunset has passed 😅 We're now getting error: |
Hi everyone, I'm wondering, because after removing the |
I have not thought about it yet. I am waiting a week for Oracle's response (if it comes); according to our new policy that counts as "lazy consent" now, and then I will take a closer look after that. There is also an option to turn mysql-connector-python into an ACTUALLY optional feature (which I think is the best option), so make it an extra (we already have a few of those). In this case we should leave it.
@cgadam: It is likely we might have a proposal for how to solve it soon. Would you be willing to test it if I give you access to a beta/pre-release of the google provider with an implemented workaround (with the intention of making it into the next release)?
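For illustration only, here is a minimal sketch of what an "actually optional" driver could look like on the code side: the import is deferred until the driver is needed, and a missing package produces a hint about the extra. The helper `load_optional`, the exception class, and the extra name passed in are hypothetical and are not taken from the Airflow code base.

```python
import importlib
from types import ModuleType


class OptionalDependencyMissing(ImportError):
    """Raised when an optional driver is requested but not installed."""


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import `module_name` lazily; name the (hypothetical) extra if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise OptionalDependencyMissing(
            f"{module_name!r} is not installed; install the {extra!r} extra to use it"
        ) from exc
```

A hook would then call something like `load_optional("mysql.connector", "mysql-connector-python")` inside the method that opens the connection, so installing the provider without the extra keeps working until that code path is actually used.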
- Secret Manager was missing the update to v2; it now expects a request dict
- Compute ssh had a bug when no cmd_timeout was passed
- Cloud Build tests were improved/refactored in community, so deleting old ones
- googleapiclient.errors.HttpError was incorrectly used in our tests; it didn't matter before, but a change in the class makes HttpError() raise an error in initialization the way we were using it before
- fix static checks

```
$ pytest tests/providers/google/cloud/
...
===== 2763 passed, 71 skipped, 21 warnings in 193.46s (0:03:13) =====
```
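To make the "request dict" point concrete, here is a rough sketch of the 2.x calling style, in which the secret version name travels inside a single `request` mapping. The helper names and `FakeClient` are hypothetical stand-ins so the snippet runs without GCP credentials; this is not the hook code from this PR.

```python
from types import SimpleNamespace


def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the fully qualified resource name of a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(client, project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload using a single `request` mapping (2.x style)."""
    response = client.access_secret_version(
        request={"name": secret_version_name(project_id, secret_id, version)}
    )
    return response.payload.data.decode("UTF-8")


class FakeClient:
    """Hypothetical stand-in that records the request and returns canned bytes."""

    def __init__(self) -> None:
        self.last_request = None

    def access_secret_version(self, request):
        self.last_request = request
        return SimpleNamespace(payload=SimpleNamespace(data=b"s3cr3t"))


client = FakeClient()
value = read_secret(client, "my-project", "db-password")
```

With the real library installed, a `secretmanager.SecretManagerServiceClient()` instance (from `google.cloud import secretmanager`) would be passed in place of `FakeClient`.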
👀 👀 👀 👀 |
I wanted to say thanks for all this work and I've been tracking it from a distance. I'm looking forward to the updated Dataproc libs for further enhancements to the Dataproc serverless operator. |
Thanks in the name of all the people who worked on that (I was also just helping) - it's rare to get an unsolicited positive feedback and a thank you note. So rare :). |
Those are intermittent errors only (I need to make them more stable). Merging |
🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 |
CC: @ephraimbuddy -> I just realized we will need it - I marked this one also for 2.6.2. While the "code" changes aren't used in the release from 2.6.2, the "dependency" part (provider.yaml and generated/provider_dependencies.json) will be needed to properly build CI once we release the new google provider with all its deps |
I'm hoping that we will release 2.6.2 as followup right after provider wave is released. |
* Update SDK versions for Google provider

* Adjust google ads operators to v12

  Changes:
  - fix tests/system/providers/google/cloud/bigquery/example_bigquery_queries.py
  - fix tests/system/providers/google/cloud/bigquery/example_bigquery_queries_async.py

* Fix GCS system tests

* Fix CloudBuild unit test

* Update BigTable operators to accommodate new dependencies.

* Fix Cloud Tasks System tests

  Tasks dag was quite flaky without the retry option in the run_task step, but it's consistently green with the option set. We also add a GCP_APP_ENGINE_LOCATION env variable since this depends on the used GCP Project App Engine's location

* Add setup docstring to Tasks system tests.

* Update Vision operators to accommodate new dependencies.

  Changes:
  - fix methods for CloudVisionHook
  - fix Vision Operators
  - fix tests/providers/google/cloud/hooks/test_vision.py
  - fix tests/providers/google/cloud/operators/test_vision.py
  - fix tests/system/providers/google/cloud/vision/example_vision_annotate_image.py
  - fix tests/system/providers/google/cloud/vision/example_vision_autogenerated.py
  - fix tests/system/providers/google/cloud/vision/example_vision_explicit.py

* Update SpeechToText operators to accommodate new dependencies.

  Changes:
  - fix synthesize_speech method for CloudTextToSpeechHook
  - fix CloudSpeechToTextRecognizeSpeechOperator
  - fix tests/providers/google/cloud/operators/test_speech_to_text.py
  - fix tests/providers/google/cloud/hooks/test_text_to_speech.py
  - fix tests/providers/google/cloud/hooks/test_speech_to_text.py

* Update Translate Speech operators to accommodate new dependencies.

  Changes:
  - fix synthesize_speech method for CloudTextToSpeechHook
  - fix CloudTranslateSpeechOperator
  - tests/providers/google/cloud/operators/test_translate_speech.py

* Update VideoIntelligence operators to accommodate new dependencies.

  Changes:
  - fix annotate_video method for CloudVideoIntelligenceHook
  - fix VideoIntelligence Operators
  - fix tests/providers/google/cloud/hooks/test_video_intelligence.py
  - fix tests/providers/google/cloud/operators/test_video_intelligence.py

* Update Compute Engine operators to accommodate new dependencies.

  Changes:
  - added wait_for_operation_complete() method to check the execution flow
  - added new attribute cmd_timeout for ComputeEngineSSHHook

* Fix Stackdriver system test

  This test has not worked because the slack channel and credentials were not set up. We now test the same operators by creating notification channels and policy alerts against pubsub topics, which don't need to exist before the test is run, making the test self-contained.

* Update Natural Language operators to accommodate new dependencies.

  Changes:
  - fix airflow/providers/google/cloud/operators/natural_language.py
  - fix airflow/providers/google/cloud/hooks/natural_language.py
  - fix tests/providers/google/cloud/hooks/test_natural_language.py
  - fix tests/providers/google/cloud/operators/test_natural_language.py
  - fix tests/system/providers/google/cloud/natural_language/example_natural_language.py

* Update Composer system tests. Fix environment id to contain underscores.

* Update AutoML operators to accommodate new dependencies.

  Changes:
  - add timeout parameter to all long-running operations for operators
  - fix tests/system/providers/google/cloud/automl/example_automl_dataset.py
  - fix tests/system/providers/google/cloud/automl/example_automl_model.py
  - fix tests/system/providers/google/cloud/automl/example_automl_nl_text_extraction.py
  - fix tests/system/providers/google/cloud/automl/example_automl_vision_classification.py

* Fix Cloud SQL delete operator

  For some delete instance operations, the operation stops being available ~9 seconds after completion, so we need a shorter sleep time to make sure we don't miss the DONE status.

* Update VertexAI operators to accommodate new dependencies.

* Add SQL to Sheets Test instructions

* Update Dataproc Metastore operators to accommodate new dependencies.

* Update Dataproc operators to accommodate new dependencies.

* Update Dataflow sys tests to new sdk

* Update Dataproc on gke operators to accommodate new dependencies.

* Update MLEngine operators to accommodate new dependencies.

  Changes:
  - update train model that is used for prediction
  - update version and runner for ApacheBeam in utils for MLEngine
  - update connection inside async hook

* Update Dataprep operators to accommodate new dependencies.

  Changes:
  - fix tests/system/providers/google/cloud/dataprep/example_dataprep.py

* Add Dataflow Go system test

* Update providers.yaml for google

* fixup! Update providers.yaml for google

* Google SDK Fixes after rebase

  - Secret Manager was missing the update to v2; it now expects a request dict
  - Compute ssh had a bug when no cmd_timeout was passed
  - Cloud Build tests were improved/refactored in community, so deleting old ones
  - googleapiclient.errors.HttpError was incorrectly used in our tests; it didn't matter before, but a change in the class makes HttpError() raise an error in initialization the way we were using it before
  - fix static checks

* Fix Google providers type errors

---------

Co-authored-by: Lukasz Wyszomirski <wyszomirski@google.com>
Co-authored-by: Maksim Moiseenkov <maksim_moiseenkov@epam.com>
Co-authored-by: Eugene Kostieiev <kosteev@google.com>
Co-authored-by: Augusto Hidalgo <augustoh@google.com>
Co-authored-by: Beata Kossakowska <bkossakowska@google.com>
Co-authored-by: Ulada Zakharava <uladaz@google.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>

(cherry picked from commit 28d1bf8)
As everyone knows, the google provider package has a lot of old dependencies. I would like to start migrating to the latest versions of the SDKs. For now we are blocked by some other dependencies because they require `protobuf<4`. The Google SDKs also had a lot of breaking changes, so after updating we need to adjust the broken operators. I investigated how big this problem is, and I'm attaching the list of services where some of the operators are broken:
Fixes: #27292