-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added w.config.account_host
to get the relevant account host from a workspace client
#390
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified lines are covered by tests ✅
📢 Thoughts on this report? Let us know!. |
databricks/sdk/azure.py
Outdated
active_directory_endpoint="https://login.microsoftonline.de/"), | ||
GERMAN=AzureEnvironment( | ||
name="AzureGermanCloud", | ||
databricks_hostname_suffix='.databricks.azure.de', # TODO: i'm not sure about this one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is OK because we don't run Databricks in this cloud (and it is being deprecated: https://learn.microsoft.com/en-us/previous-versions/azure/germany/germany-welcome)
@@ -617,8 +613,31 @@ def is_account_client(self) -> bool: | |||
return False | |||
return self.host.startswith("https://accounts.") or self.host.startswith("https://accounts-dod.") | |||
|
|||
@property | |||
def account_host(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work for dev/staging, right? We occasionally get internal issues where the SDK doesn't work well with dev/staging.
Also, can we label this somehow as experimental? In the future, if we introduce a workspace-level metadata service, the accounts endpoint can be exposed there, and we can remove this property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you suggest on the pattern here to make it work for dev/staging?
btw, once we introduce workspace-level metadata service, we should just print warnings from within this call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea: have a list to lookup cloud/env/account host name based on domain suffix, from most-specific to least-specific:
# Most specific to least specific
[
('.dev.databricks.com', 'AWS', 'DEV', 'PUBLIC', 'accounts.dev.databricks.com'),
('.staging.cloud.databricks.com', 'AWS', 'STAGING', 'PUBLIC', 'accounts.staging.cloud.databricks.com'),
('.cloud.databricks.com', 'AWS', 'PROD', 'PUBLIC', 'accounts.cloud.databricks.com'),
('.cloud.databricks.us', 'AWS', 'PROD', 'GOV', 'accounts.cloud.databricks.us'),
('.dev.azuredatabricks.net', 'AZURE', 'DEV', 'PUBLIC', 'accounts.dev.azuredatabricks.net')
('.staging.azuredatabricks.net', 'AZURE', 'STAGING', 'PUBLIC', 'accounts.staging.azuredatabricks.net'),
('.azuredatabricks.net', 'AZURE', 'PROD', 'PUBLIC', 'accounts.azuredatabricks.net'),
('.databricks.azure.us', 'AZURE', 'PROD', 'GOV', 'accounts.databricks.azure.us'),
('.dev.gcp.databricks.com', 'GCP', 'DEV', 'PUBLIC', 'accounts.dev.gcp.databricks.com'),
('.staging.gcp.databricks.com', 'GCP', 'STAGING', 'PUBLIC', 'accounts.staging.gcp.databricks.com'),
('.gcp.databricks.com', 'GCP', 'PROD', 'PUBLIC', 'accounts.gcp.databricks.com'),
]
The simpler version is to just replace the most specific zone with accounts
, which works everywhere today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgyucht do we want to expose dev/staging?... if we do - then why don't we expose the azure_login_app_id
along with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely fine by me!
One nit to make this work for dev/staging & label as experimental, but otherwise this seems reasonable to me if it unblocks you/customers. |
all the info here is fine to be public ( and already public) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you for contributing! Couple nits/suggestions, otherwise this looks good to me.
@@ -617,8 +613,31 @@ def is_account_client(self) -> bool: | |||
return False | |||
return self.host.startswith("https://accounts.") or self.host.startswith("https://accounts-dod.") | |||
|
|||
@property | |||
def account_host(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely fine by me!
databricks/sdk/core.py
Outdated
cloud: Cloud | ||
|
||
# domain suffixes are not very secret: https://crt.sh/?q=databricks | ||
tld: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's call this dns_zone
(I think TLD would traditionally refer to com.
or net.
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dns_zone
it is then.
databricks/sdk/core.py
Outdated
def deployment(self, deployment_name: str) -> str: | ||
return f'https://{deployment_name}{self.tld}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is maybe just a bit abstract. It won't work in Azure the way a user might expect, where the DNS name can't be determined from the workspace ID alone. Given we're only using this for account host, I suggest we just inline it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's already part of our response:
for now, we work it around in https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/account/workspaces.py#L101-L126, but @iaroslav-db fixed the mapping so that we don't have to work it around soon ;)
I need this function to create a workspace client from an account client. See:
- https://github.com/databrickslabs/watchdog/blob/main/scan/scan.go#L144-L154
- https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/account/workspaces.py#L122-L126
- https://github.com/databricks/terraform-provider-databricks/blob/baa61c451055271e14e7a9bc3db43537dcfda8ea/mws/resource_mws_workspaces.go#L372-L377
I'll do a follow-up PR to account client once azure changes are rolled out to prod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh perfect, I actually didn't realize those workspace APIs worked in azure yet.
… workspace client This simplifies downstream integration testing as well as configuration introduce `DatabricksEnvironment` ..
bea774f
to
cb830fc
Compare
this needs a bit more adjustments to work |
Are similar PRs for the other SDKs tracked some place? |
@pietern i'll have one for GoLang the following week - once i figure out the way to have it friendly for unit tests, need the same thing for two projects |
…ere IMDS doesn't give host environment information (#700) ## Changes This PR allows determining Azure Environment from a Databricks account or workspace hostname, removing the need for a separate configuration/environment variable and complexities related to Azure MSI from within ACR. Similar functionality in Python SDK: - databricks/databricks-sdk-py#390 Stacked on top of: - #699 ## Tests <!-- How is this tested? Please see the checklist below and also describe any other relevant tests --> - [x] `make test` passing - [x] `make fmt` applied - [ ] relevant integration tests applied
Changes
This simplifies downstream integration testing as well as configuration
Tests
make test
run locallymake fmt
applied