-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix] Decouple OAuth functionality from Config
#784
Conversation
This PR breaks backwards compatibility for databrickslabs/blueprint downstream. See build logs for more details. Running from downstreams #551 |
This PR breaks backwards compatibility for databrickslabs/ucx downstream. See build logs for more details. Running from downstreams #551 |
## Changes `ApiClient` is also coupled to the `Config` object, which means that it can't be used in situations where there is no config. For example, when fetching OIDC endpoints, the user may not have a complete `Config` instance yet. However, failures when requesting from those endpoints should still be retried according to the SDK's retry policy. To address this, I've split the ApiClient into `_BaseClient` and `ApiClient`. `_BaseClient` is the core implementation of the client without any dependency on the `Config`. This is similar to what @rauchy did in the Java SDK to cut the dependency between the `ApiClient` and `DatabricksConfig`. The `_BaseClient` can then be used when fetching OIDC endpoint information. This will be used in #784 to support retrying OAuth OIDC endpoint fetches. ## Tests <!-- How is this tested? Please see the checklist below and also describe any other relevant tests --> - [ ] `make test` run locally - [ ] `make fmt` applied - [ ] relevant integration tests applied
This changes public classes/methods and introduces backward incompatible changes. |
I'll list the breaking changes here and we can discuss whether the mitigation is sufficient:
Otherwise, it's a confusing pattern for the token to contain a reference to the client which refreshes it. That's part of the reason for the spaghetti architecture where parameters like client ID and token endpoint need to be passed through many layers of classes to be used. I would rather not try to fix these things in this PR (and probably should save them for the next SDK version). We can include the following in the changelog for this PR in the release: Breaking Changes
|
### Breaking Changes * `external_browser` now uses the `databricks-cli` app instead of the third-party "6128a518-99a9-425b-8333-4cc94f04cacd" application when performing the U2M login flow for Azure workspaces when a client ID is not otherwise specified. This matches the AWS behavior. * The signatures of several OAuth-related constructors have changed to support U2M OAuth with Azure Entra ID application registrations. See https://github.com/databricks/databricks-sdk-py/blob/main/examples/flask_app_with_oauth.py for examples of how to use these classes. * `OAuthClient()`: renamed to `OAuthClient.from_host()` * `SessionCredentials()` and `SessionCredentials.from_dict()`: now accepts `token_endpoint`, `client_id`, `client_secret`, and `refresh_url` as parameters, rather than accepting the `OAuthClient`. * `TokenCache()`: now accepts `host`, `token_endpoint`, `client_id`, `client_secret`, and `refresh_url` as parameters, rather than accepting the `OAuthClient`. ### Bug Fixes * Decouple OAuth functionality from `Config` ([#784](#784)). ### Release * Release v0.35.0 ([#793](#793)).
### Breaking Changes * `external_browser` now uses the `databricks-cli` app instead of the third-party "6128a518-99a9-425b-8333-4cc94f04cacd" application when performing the U2M login flow for Azure workspaces when a client ID is not otherwise specified. This matches the AWS behavior. * The signatures of several OAuth-related constructors have changed to support U2M OAuth with Azure Entra ID application registrations. See https://github.com/databricks/databricks-sdk-py/blob/main/examples/flask_app_with_oauth.py for examples of how to use these classes. * `OAuthClient()`: renamed to `OAuthClient.from_host()` * `SessionCredentials()` and `SessionCredentials.from_dict()`: now accepts `token_endpoint`, `client_id`, `client_secret`, and `refresh_url` as parameters, rather than accepting the `OAuthClient`. * `TokenCache()`: now accepts `host`, `token_endpoint`, `client_id`, `client_secret`, and `refresh_url` as parameters, rather than accepting the `OAuthClient`. ### Bug Fixes * Decouple OAuth functionality from `Config` ([#784](#784)). ### Release * Release v0.35.0 ([#793](#793)). Co-authored-by: Omer Lachish <rauchy@users.noreply.github.com>
Changes
OAuth Refactoring
Currently, OAuthClient uses Config internally to resolve the OIDC endpoints by passing the client ID and host to an internal Config instance and calling its
oidc_endpoints
method. This has a few drawbacks:Config
depends on methods inoauth.py
, andOAuthClient
depends onConfig
. This currently doesn't break because theConfig
import is done at runtime in theOAuthClient
constructor.Rather than having these classes deeply coupled to one another, we can allow users to fetch the OIDC endpoints for a given account/workspace as a top-level functionality and provide this to
OAuthClient
. This breaks the cyclic dependency and doesn't requireOAuthClient
to expose any unnecessary parameters.Further, I've also tried to remove the coupling of the other classes in
oauth.py
toOAuthClient
. Currently,OAuthClient
serves both as the mechanism to initialize OAuth and as a kind of configuration object, capturing OAuth endpoint URLs, client ID/secret, redirect URL, and scopes. Now, the parameters for each of these classes are explicit, removing all unnecessarily coupling between them. One nice advantage is that the Consent can be serialized/deserialized without any reference to theOAuthClient
anymore.There is definitely more work to be done to simplify and clean up the OAuth implementation, but this should at least unblock users who need to use Azure Entra ID U2M OAuth in the SDK.
Tests
The new OIDC endpoint methods are tested, and those tests also verify that those endpoints are retried in case of rate limiting.
I ran the flask app example against an AWS workspace, and I ran the external-browser demo example against AWS, Azure and GCP workspaces with the default client ID and with a newly created OAuth app with and without credentials.
make test
run locallymake fmt
applied