Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DECO-2485] Handle Azure authentication when WorkspaceResourceID is provided #328

Merged
merged 4 commits into from
Sep 11, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 38 additions & 4 deletions databricks/sdk/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -253,8 +253,11 @@ def refresh(self) -> Token:
class AzureCliTokenSource(CliTokenSource):
""" Obtain the token granted by `az login` CLI command """

def __init__(self, resource: str):
def __init__(self, resource: str, subscription: str = ""):
cmd = ["az", "account", "get-access-token", "--resource", resource, "--output", "json"]
if subscription != "":
cmd.append("--subscription")
cmd.append(subscription)
Comment on lines +259 to +260
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can use extend if we are adding more than one item. It would be more readable.

super().__init__(cmd=cmd,
token_type_field='tokenType',
access_token_field='accessToken',
Expand All @@ -264,8 +267,8 @@ def __init__(self, resource: str):
@credentials_provider('azure-cli', ['is_azure'])
def azure_cli(cfg: 'Config') -> Optional[HeaderFactory]:
""" Adds refreshed OAuth token granted by `az login` command to every request. """
token_source = AzureCliTokenSource(cfg.effective_azure_login_app_id)
mgmt_token_source = AzureCliTokenSource(cfg.arm_environment.service_management_endpoint)
token_source = get_token(cfg, cfg.effective_azure_login_app_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move error-handling logic to for_resource

mgmt_token_source = get_token(cfg, cfg.arm_environment.service_management_endpoint)
try:
token_source.token()
except FileNotFoundError:
Expand All @@ -278,7 +281,7 @@ def azure_cli(cfg: 'Config') -> Optional[HeaderFactory]:
logger.debug(f'Not including service management token in headers', exc_info=e)
mgmt_token_source = None

_ensure_host_present(cfg, lambda resource: AzureCliTokenSource(resource))
_ensure_host_present(cfg, lambda resource: get_token(cfg, resource))
logger.info("Using Azure CLI authentication with AAD tokens")

def inner() -> Dict[str, str]:
Expand All @@ -292,6 +295,37 @@ def inner() -> Dict[str, str]:
return inner


def get_token(cfg: 'Config', resource: str) -> AzureCliTokenSource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't this validation be moved to AzureCliTokenSource constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would mean adding a try/except block around the "super().init" code. Not being that familiar with Python, and coming from languages like Java which don't allow you to do that, it seemed like a bad idea to me. But I can add it if it is considered acceptable practice in Python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you then try adding validate (or validate_subscription, or something like that) method on AzureCliTokenSource and call it instead of the first call to token.

    try:
        token_source.token()
    except FileNotFoundError:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, the IncorrectClaimException in the original ticket does refer to Azure Tenant (identities), not the Azure Subscription (resources). there are valid cases, when user has no access to subscription, but has access to a workspace. also, azure_workspace_resource_id is not required for Azure CLI auth type.

to solve IncorrectClaimException we have to supply azure_login_app_id. I'll reach to you on slack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staticmethod
def for_resource(cfg: 'Config', resource: str) -> 'AzureCliTokenSource':
    ...

AzureCliTokenSource.for_resource(...)

subscription = get_subscription(cfg)
if subscription != "":
token = AzureCliTokenSource(resource, subscription)
try:
# This will fail if the user has access to the workspace, but not to the subscription
# itself.
# In such case, we fall back to not using the subscription.
token.token()
return token
except OSError:
logger.warning("Failed to get token for subscription. Using resource only token.")
else:
logger.warning(
"azure_workspace_resource_id field not provided. " +
"It is recommended to specify this field in the Databricks configuration to avoid authentication errors."
)
return AzureCliTokenSource(resource)


def get_subscription(cfg: 'Config') -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should move as a private method to AzureCliTokenSource as well

resource = cfg.azure_workspace_resource_id
if resource == None or resource == "":
return ""
components = resource.split('/')
if len(components) < 3:
logger.warning("Invalid azure workspace resource ID")
return ""
return components[2]


class DatabricksCliTokenSource(CliTokenSource):
""" Obtain the token granted by `databricks auth login` CLI command """

Expand Down
9 changes: 9 additions & 0 deletions tests/test_auth_manual_tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,3 +27,12 @@ def test_azure_cli_user_no_management_access(monkeypatch):
resource_id = '/subscriptions/123/resourceGroups/abc/providers/Microsoft.Databricks/workspaces/abc123'
cfg = Config(auth_type='azure-cli', host='x', azure_workspace_resource_id=resource_id)
assert 'X-Databricks-Azure-SP-Management-Token' not in cfg.authenticate()


def test_azure_cli_fallback(monkeypatch):
monkeypatch.setenv('HOME', __tests__ + '/testdata/azure')
monkeypatch.setenv('PATH', __tests__ + '/testdata:/bin')
monkeypatch.setenv('FAIL_IF', 'subscription')
resource_id = '/subscriptions/123/resourceGroups/abc/providers/Microsoft.Databricks/workspaces/abc123'
cfg = Config(auth_type='azure-cli', host='x', azure_workspace_resource_id=resource_id)
assert 'X-Databricks-Azure-SP-Management-Token' in cfg.authenticate()