Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DECO-2485] Handle Azure authentication when WorkspaceResourceID is provided #328

Merged
merged 4 commits into from
Sep 11, 2023

Conversation

hectorcast-db
Copy link
Contributor

@hectorcast-db hectorcast-db commented Aug 31, 2023

Changes

Handle Azure authentication when WorkspaceResourceID is provided

Get token for the correct subscription

Tests

  • Created Unit tests
  • Manually listed workspace cluster in the following scenarios:
    • User with wrong default tenant. No WorkspaceResourceID provided: Fail (expected). WARN log emitted.
    • User with wrong default tenant. WorkspaceResourceID provided: Succeed
    • User with no subscription. No WorkspaceResourceID provided: Succeed. WARN log emitted.
    • User with no subscription. WorkspaceResourceID provided: Succeed (fallback mode, expected).

@hectorcast-db hectorcast-db requested a review from mgyucht August 31, 2023 14:52
nfx
nfx previously requested changes Sep 1, 2023
@@ -292,6 +295,37 @@ def inner() -> Dict[str, str]:
return inner


def get_token(cfg: 'Config', resource: str) -> AzureCliTokenSource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't this validation be moved to AzureCliTokenSource constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would mean adding a try/except block around the "super().init" code. Not being that familiar with Python, and coming from languages like Java which don't allow you to do that, it seemed like a bad idea to me. But I can add it if it is considered acceptable practice in Python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you then try adding validate (or validate_subscription, or something like that) method on AzureCliTokenSource and call it instead of the first call to token.

    try:
        token_source.token()
    except FileNotFoundError:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, the IncorrectClaimException in the original ticket does refer to Azure Tenant (identities), not the Azure Subscription (resources). there are valid cases, when user has no access to subscription, but has access to a workspace. also, azure_workspace_resource_id is not required for Azure CLI auth type.

to solve IncorrectClaimException we have to supply azure_login_app_id. I'll reach to you on slack.

# In such case, we fall back to not using the subscription.
token.token()
return token
except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't catch broad exceptions, catch concrete ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

return AzureCliTokenSource(resource)


def get_subscription(cfg: 'Config') -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should move as a private method to AzureCliTokenSource as well

@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2023

Codecov Report

Patch coverage is 84.84% of modified lines.

Files Changed Coverage
databricks/sdk/core.py 84.84%

📢 Thoughts on this report? Let us know!.

Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, need a couple changes to error handling, and the new functions should be internal to the AzureCliTokenSource class. Once these are addressed, LGTM.

@@ -292,6 +295,37 @@ def inner() -> Dict[str, str]:
return inner


def get_token(cfg: 'Config', resource: str) -> AzureCliTokenSource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staticmethod
def for_resource(cfg: 'Config', resource: str) -> 'AzureCliTokenSource':
    ...

AzureCliTokenSource.for_resource(...)

@@ -264,8 +267,8 @@ def __init__(self, resource: str):
@credentials_provider('azure-cli', ['is_azure'])
def azure_cli(cfg: 'Config') -> Optional[HeaderFactory]:
""" Adds refreshed OAuth token granted by `az login` command to every request. """
token_source = AzureCliTokenSource(cfg.effective_azure_login_app_id)
mgmt_token_source = AzureCliTokenSource(cfg.arm_environment.service_management_endpoint)
token_source = get_token(cfg, cfg.effective_azure_login_app_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move error-handling logic to for_resource

Copy link
Contributor

@tanmay-db tanmay-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of nits otherwise LGTM.

Comment on lines +259 to +260
cmd.append("--subscription")
cmd.append(subscription)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can use extend if we are adding more than one item. It would be more readable.

Comment on lines +273 to +274
# itself.
# In such case, we fall back to not using the subscription.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the bottom comment can be moved above.

def _get_subscription(cfg: 'Config') -> str:
resource = cfg.azure_workspace_resource_id
if resource == None or resource == "":
return ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to log something along the lines of resource not found.

@hectorcast-db hectorcast-db requested a review from nfx September 8, 2023 08:32
@hectorcast-db hectorcast-db dismissed nfx’s stale review September 11, 2023 09:54

Applied relevant changes. PR approved by team.

@hectorcast-db hectorcast-db added this pull request to the merge queue Sep 11, 2023
Merged via the queue into main with commit 986d1d9 Sep 11, 2023
@hectorcast-db hectorcast-db deleted the DECO-2485 branch September 11, 2023 09:56
tanmay-db added a commit that referenced this pull request Sep 20, 2023
* Don't try to import runtime_auth when not in runtime ([#327](#327)).
* [DECO-2485] Handle Azure authentication when WorkspaceResourceID is provided ([#328](#328)).
* Add ErrorInfo to API errors ([#347](#347)).
* Fix eager default argument evaluation in `DatabricksError` ([#353](#353)).
* Fixed code generation of primitive types ([#354](#354)).
* Updated SDK to changes in OpenAPI specification ([#355](#355)).

API Changes:

 * Changed `list()` method for [a.account_metastore_assignments](https://databricks-sdk-py.readthedocs.io/en/latest/account/account_metastore_assignments.html) account-level service to return `databricks.sdk.service.catalog.WorkspaceIdList` dataclass.
 * Changed `artifact_matchers` field for `databricks.sdk.service.catalog.ArtifactAllowlistInfo` to `databricks.sdk.service.catalog.ArtifactMatcherList` dataclass.
 * Changed `artifact_matchers` field for `databricks.sdk.service.catalog.SetArtifactAllowlist` to `databricks.sdk.service.catalog.ArtifactMatcherList` dataclass.
 * Added `databricks.sdk.service.catalog.WorkspaceId` dataclass.
 * Changed `cancel_all_runs()` method for [w.jobs](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/jobs.html) workspace-level service with new required argument order.
 * Changed `job_id` field for `databricks.sdk.service.jobs.CancelAllRuns` to no longer be required.
 * Added `all_queued_runs` field for `databricks.sdk.service.jobs.CancelAllRuns`.
 * Added `queue` field for `databricks.sdk.service.jobs.CreateJob`.
 * Added `queue` field for `databricks.sdk.service.jobs.JobSettings`.
 * Added `queue` field for `databricks.sdk.service.jobs.RunNow`.
 * Added `queue_reason` field for `databricks.sdk.service.jobs.RunState`.
 * Added `queue_duration` field for `databricks.sdk.service.jobs.RunTask`.
 * Added `queue` field for `databricks.sdk.service.jobs.SubmitRun`.
 * Added `databricks.sdk.service.jobs.QueueSettings` dataclass.
 * Added [a.o_auth_published_apps](https://databricks-sdk-py.readthedocs.io/en/latest/account/o_auth_published_apps.html) account-level service.
 * Added `databricks.sdk.service.oauth2.GetPublishedAppsOutput` dataclass.
 * Added `databricks.sdk.service.oauth2.ListOAuthPublishedAppsRequest` dataclass.
 * Added `databricks.sdk.service.oauth2.PublishedAppOutput` dataclass.
 * Added `patch()` method for [w.serving_endpoints](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/serving_endpoints.html) workspace-level service.
 * Added `tags` field for `databricks.sdk.service.serving.CreateServingEndpoint`.
 * Added `tags` field for `databricks.sdk.service.serving.ServingEndpoint`.
 * Added `tags` field for `databricks.sdk.service.serving.ServingEndpointDetailed`.
 * Added `databricks.sdk.service.serving.EndpointTag` dataclass.
 * Added `databricks.sdk.service.serving.PatchServingEndpointTags` dataclass.
 * Added [w.credentials_manager](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/credentials_manager.html) workspace-level service.
 * Added `databricks.sdk.service.settings.ExchangeToken` dataclass.
 * Added `databricks.sdk.service.settings.ExchangeTokenRequest` dataclass.
 * Added `databricks.sdk.service.settings.ExchangeTokenResponse` dataclass.
 * Added `databricks.sdk.service.settings.PartitionId` dataclass.
 * Added `databricks.sdk.service.settings.TokenType` dataclass.
 * Changed `execute_statement()` method for [w.statement_execution](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/statement_execution.html) workspace-level service with new required argument order.
 * Added `empty_result_state` field for `databricks.sdk.service.sql.AlertOptions`.
 * Removed `databricks.sdk.service.sql.ChunkInfo` dataclass.
 * Changed `on_wait_timeout` field for `databricks.sdk.service.sql.ExecuteStatementRequest` to `databricks.sdk.service.sql.ExecuteStatementRequestOnWaitTimeout` dataclass.
 * Changed `statement` field for `databricks.sdk.service.sql.ExecuteStatementRequest` to be required.
 * Changed `warehouse_id` field for `databricks.sdk.service.sql.ExecuteStatementRequest` to be required.
 * Changed `chunks` field for `databricks.sdk.service.sql.ResultManifest` to `databricks.sdk.service.sql.BaseChunkInfoList` dataclass.
 * Added `truncated` field for `databricks.sdk.service.sql.ResultManifest`.
 * Removed `databricks.sdk.service.sql.TimeoutAction` dataclass.
 * Added `databricks.sdk.service.sql.AlertOptionsEmptyResultState` dataclass.
 * Added `databricks.sdk.service.sql.BaseChunkInfo` dataclass.
 * Added `databricks.sdk.service.sql.ExecuteStatementRequestOnWaitTimeout` dataclass.

OpenAPI SHA: b52a3b410976501f08f76ca0b355fb2dca876953, Date: 2023-09-15
@tanmay-db tanmay-db mentioned this pull request Sep 20, 2023
github-merge-queue bot pushed a commit that referenced this pull request Sep 20, 2023
* Don't try to import runtime_auth when not in runtime
([#327](#327)).
* Handled Azure authentication when WorkspaceResourceID is provided
([#328](#328)).
* Added ErrorInfo to API errors
([#347](#347)).
* Fixed eager default argument evaluation in `DatabricksError`
([#353](#353)).
* Fixed code generation of primitive types
([#354](#354)).
* Updated SDK to changes in OpenAPI specification
([#355](#355)).

API Changes:

* Changed `list()` method for
[a.account_metastore_assignments](https://databricks-sdk-py.readthedocs.io/en/latest/account/account_metastore_assignments.html)
account-level service to return
`databricks.sdk.service.catalog.WorkspaceIdList` dataclass.
* Changed `artifact_matchers` field for
`databricks.sdk.service.catalog.ArtifactAllowlistInfo` to
`databricks.sdk.service.catalog.ArtifactMatcherList` dataclass.
* Changed `artifact_matchers` field for
`databricks.sdk.service.catalog.SetArtifactAllowlist` to
`databricks.sdk.service.catalog.ArtifactMatcherList` dataclass.
 * Added `databricks.sdk.service.catalog.WorkspaceId` dataclass.
* Changed `cancel_all_runs()` method for
[w.jobs](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/jobs.html)
workspace-level service with new required argument order.
* Changed `job_id` field for `databricks.sdk.service.jobs.CancelAllRuns`
to no longer be required.
* Added `all_queued_runs` field for
`databricks.sdk.service.jobs.CancelAllRuns`.
 * Added `queue` field for `databricks.sdk.service.jobs.CreateJob`.
 * Added `queue` field for `databricks.sdk.service.jobs.JobSettings`.
 * Added `queue` field for `databricks.sdk.service.jobs.RunNow`.
* Added `queue_reason` field for `databricks.sdk.service.jobs.RunState`.
* Added `queue_duration` field for
`databricks.sdk.service.jobs.RunTask`.
 * Added `queue` field for `databricks.sdk.service.jobs.SubmitRun`.
 * Added `databricks.sdk.service.jobs.QueueSettings` dataclass.
* Added
[a.o_auth_published_apps](https://databricks-sdk-py.readthedocs.io/en/latest/account/o_auth_published_apps.html)
account-level service.
* Added `databricks.sdk.service.oauth2.GetPublishedAppsOutput`
dataclass.
* Added `databricks.sdk.service.oauth2.ListOAuthPublishedAppsRequest`
dataclass.
 * Added `databricks.sdk.service.oauth2.PublishedAppOutput` dataclass.
* Added `patch()` method for
[w.serving_endpoints](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/serving_endpoints.html)
workspace-level service.
* Added `tags` field for
`databricks.sdk.service.serving.CreateServingEndpoint`.
* Added `tags` field for
`databricks.sdk.service.serving.ServingEndpoint`.
* Added `tags` field for
`databricks.sdk.service.serving.ServingEndpointDetailed`.
 * Added `databricks.sdk.service.serving.EndpointTag` dataclass.
* Added `databricks.sdk.service.serving.PatchServingEndpointTags`
dataclass.
* Added
[w.credentials_manager](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/credentials_manager.html)
workspace-level service.
 * Added `databricks.sdk.service.settings.ExchangeToken` dataclass.
* Added `databricks.sdk.service.settings.ExchangeTokenRequest`
dataclass.
* Added `databricks.sdk.service.settings.ExchangeTokenResponse`
dataclass.
 * Added `databricks.sdk.service.settings.PartitionId` dataclass.
 * Added `databricks.sdk.service.settings.TokenType` dataclass.
* Changed `execute_statement()` method for
[w.statement_execution](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/statement_execution.html)
workspace-level service with new required argument order.
* Added `empty_result_state` field for
`databricks.sdk.service.sql.AlertOptions`.
 * Removed `databricks.sdk.service.sql.ChunkInfo` dataclass.
* Changed `on_wait_timeout` field for
`databricks.sdk.service.sql.ExecuteStatementRequest` to
`databricks.sdk.service.sql.ExecuteStatementRequestOnWaitTimeout`
dataclass.
* Changed `statement` field for
`databricks.sdk.service.sql.ExecuteStatementRequest` to be required.
* Changed `warehouse_id` field for
`databricks.sdk.service.sql.ExecuteStatementRequest` to be required.
* Changed `chunks` field for `databricks.sdk.service.sql.ResultManifest`
to `databricks.sdk.service.sql.BaseChunkInfoList` dataclass.
* Added `truncated` field for
`databricks.sdk.service.sql.ResultManifest`.
 * Removed `databricks.sdk.service.sql.TimeoutAction` dataclass.
* Added `databricks.sdk.service.sql.AlertOptionsEmptyResultState`
dataclass.
 * Added `databricks.sdk.service.sql.BaseChunkInfo` dataclass.
* Added
`databricks.sdk.service.sql.ExecuteStatementRequestOnWaitTimeout`
dataclass.

OpenAPI SHA: b52a3b410976501f08f76ca0b355fb2dca876953, Date: 2023-09-15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants