-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Update Airflow Airbyte Provider to use the new Airbyte API #37244
Conversation
…ng away from the unsupported Configuration API
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
Is there official statement of Airbyte saying that? I couldn't find such statement in the SDK GitHub nor in PyPi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR introduce breaking changes
Note
Existed users might use pretty old OSS (last year release) version of Airbyte which only support Configuration API
Option 1
If we would like to introduce Breaking Changes then we need to create notes into the Change log and potentially simple migration guide, e.g. minimal version of Airbyte which support Official API
Option 2
In the other way round if we would like to keep both support for a while we need to create separate Hooks, Operators and etc (simple copy-paste) which communicate with Airbyte API only. In addition we might add UserWarning / FutureWarning into the old Hook, which raise a warning that end users use old API, which discontinued soon.
According to the recent changes into the Airbyte provider: #36780 the users still use Configuration API so if I were you I would choose Option 2
|
@Taragolis - thanks for your comment. What do you think about asking people to downgrade their Airflow Airbyte provider in the case that someone couldn't upgrade their Airbyte deployment? - if we went for option 1 that is. It'd keep this code base clean; without copy/pasted code. I'm happy if this PR is left open for a while - perhaps get a few more opinions. |
Sorry for delay response.
In general it is an option, however always unknown how many users read changelogs and not blindly install latest version of provider, or just use version bundled into the Official Apache Airflow Docker image The problem here that currently provider supports only Configurator API and that is mean that users of this provider only use it as result it might affect almost 100% current users. I'm not sure how many users would be affected, because we do not have any statistic rather than package how often package downloaded: PyPI Stats, PePy We always tried to deprecate something first, so end users have a chance to smooth transition to the new API. In addition we have no idea how many users extend AirbyteHook as well as related operators, so it might take a time to them to migrate. |
If it was a simple breaking change in parameters that users could correct in their DAGs. I'd be for immediate breaking change. But the fact that users might be using an older version of airbyte OSS makes a big difference, because that might require way more effort to upgrade. As I see it - this change is not breaking for DAG Authoring but breaking for - potentially - deployment. Do we know how old version of the OSS Airbyte does not support the new API ? How old it is? Any stats of usage that Airbyte might have? I think we should assess it based on likelihood of breaking multiple users workflows. If the OSS version of Airbyte that is incompatible is like 2 years old, for example, the likelihood is small. Then we could capture the error that happens in this case and explicitly ask users to downgrade. To @Taragolis comment - as long as we raise a very explicitt error and clearly communicate the user "Your Airbyte OSS version does not support the new API this provider supports, please downgrade porovider to version x.y.z" - or something like that - then option 1 is acceptable. But if someone's OSS fresh version installed 4 months ago will stop working then option 2 is better. |
Just for the record, I've checked latest Airbyte Configuration API ❯ curl -X POST "http://127.0.0.1:8000/api/v1/workspaces/list" \
--header 'accept: application/json' \
--user airbyte:password
{"workspaces":[{"workspaceId":"c81b0f84-e935-4ca4-974f-b1dbc8e4af58","customerId":"3aa7d861-9af9-4708-86ea-cd089605ef6e","email":"example@org.org","name":"c81b0f84-e935-4ca4-974f-b1dbc8e4af58","slug":"c81b0f84-e935-4ca4-974f-b1dbc8e4af58","initialSetupComplete":true,"displaySetupWizard":false,"anonymousDataCollection":false,"notifications":[],"notificationSettings":{},"defaultGeography":"auto","organizationId":"00000000-0000-0000-0000-000000000000"}]} Airbyte API ❯ curl -X GET "http://127.0.0.1:8006/v1/workspaces?includeDeleted=false&limit=20&offset=0" \
--header 'accept: application/json' \
--user airbyte:password
{"data":[{"workspaceId":"c81b0f84-e935-4ca4-974f-b1dbc8e4af58","name":"c81b0f84-e935-4ca4-974f-b1dbc8e4af58","dataResidency":"auto"}]} |
so it's ~ the threshold I had in mind. I'd say it's a bit too early to remove it then and supporting both should be preferred |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
closes: #35011
Updated the Airbyte provider to use the new Airbyte API. This update ensures continued functionality as Airbyte intends to discontinue the Configuration API, which this provider perviously used to interact with Airbyte.
The discussion in issue #35011 mentioned two possible considerations when implementing a fix:
[Airbyte's API Python SDK]: The changes in this PR do not use the official Python SDK. This SDK is still in beta. Furthermore, I wanted to keep the changes lightweight and not add another dependency, especially considering the Airbyte Airflow provider only uses 4 of Airbyte's API endpoints.
[Dual API Support]:The change removes support for the Configuration API. It would have been cumbersome to support both API's. Those who are running an older version of Airbyte can install an older version of this provider when setting up Airflow.
Also worth noting, I have tested the changes in this PR on my production Airflow deployment to verify the code works.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.