Make notebook-native auth work with more configurations of the Databricks Runtime #285

Merged (5 commits) on Aug 17, 2023

@nfx nfx commented Aug 14, 2023

This PR adds logging and hardening to `auth_type='runtime'`, which looks up credentials in the following order:

  1. `init_runtime_native_auth` for the newest DBR versions.
  2. `init_runtime_repl_auth` via the Databricks REPL context and `workspaceUrl`.
  3. `init_runtime_legacy_auth` via the IPython context for legacy runtimes and modes.

Every detection step logs additional detail at `DEBUG` level.

This PR also adds nightly integration testing for all LTS runtimes, and tests the latest runtime with UC data access modes.

Based on publicly-accessible code in https://github.com/mlflow/mlflow/blame/6bd97bde24d78bcfbf6d50c1dd0f4fac2ed6987b/mlflow/utils/databricks_utils.py
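
The lookup order described above can be sketched as a simple fallback chain. This is a minimal illustration, not the SDK's actual implementation: `resolve_runtime_auth` and the three `*_stub` functions are hypothetical names, and the convention that a lookup returns `(None, None)` when it does not apply is an assumption borrowed from the review snippet further down.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

logger = logging.getLogger("runtime_auth_sketch")

# A lookup returns (host, header_factory); (None, None) means "not available here".
HeaderFactory = Callable[[], Dict[str, str]]
LookupResult = Tuple[Optional[str], Optional[HeaderFactory]]
Lookup = Callable[[], LookupResult]


def resolve_runtime_auth(lookups: List[Lookup]) -> LookupResult:
    """Try each lookup in order and return the first usable (host, factory) pair."""
    for lookup in lookups:
        name = getattr(lookup, "__name__", repr(lookup))
        try:
            host, factory = lookup()
        except Exception as err:
            # A failing step is logged and skipped, never fatal.
            logger.debug("%s failed: %s", name, err)
            continue
        if host is None or factory is None:
            logger.debug("%s is not available in this environment", name)
            continue
        logger.debug("%s succeeded for host %s", name, host)
        return host, factory
    return None, None


# Hypothetical stand-ins for the three detection steps (assumptions, not SDK code).
def native_auth_stub() -> LookupResult:
    raise ImportError("native auth module is not importable")


def repl_auth_stub() -> LookupResult:
    return None, None


def legacy_auth_stub() -> LookupResult:
    return "https://example.test", lambda: {"Authorization": "Bearer dummy"}


host, factory = resolve_runtime_auth([native_auth_stub, repl_auth_stub, legacy_auth_stub])
```

Setting the logger to `DEBUG` shows which step was skipped and why, which is the kind of diagnostic output the description asks for.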

nfx commented Aug 14, 2023

TODO: add a job to integration test native auth for different runtimes.

@dby-tmwctw dby-tmwctw left a comment

LGTM.

@mgyucht mgyucht left a comment

Overall LGTM. Can we include somewhere the list of DBR versions/access control modes that this PR enables?


def inner() -> Dict[str, str]:
    ctx = dbutils.notebook().getContext()
    return {'Authorization': f'Bearer {getattr(ctx, "apiToken")().get()}'}
Is this equivalent?

Suggested change
- return {'Authorization': f'Bearer {getattr(ctx, "apiToken")().get()}'}
+ return {'Authorization': f'Bearer {ctx.apiToken().get()}'}

ctx = dbutils.notebook().getContext()
if ctx is None:
    return None, None
host = getattr(ctx, 'apiUrl')().get()
Is this equivalent?

Suggested change
- host = getattr(ctx, 'apiUrl')().get()
+ host = ctx.apiUrl().get()
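
Both suggestions rest on a general Python rule: `getattr(obj, "name")` with a literal attribute name performs the same attribute lookup as `obj.name`. The sketch below checks this on stand-in objects. `FakeContext` and `FakeOption` are hypothetical classes written for this illustration, not the real notebook context, so it shows the language behaviour only and says nothing about why the PR author chose `getattr`.

```python
class FakeOption:
    """Hypothetical stand-in for the Option-like wrapper returned by the context."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get(self) -> str:
        return self._value


class FakeContext:
    """Hypothetical stand-in for dbutils.notebook().getContext()."""

    def apiToken(self) -> FakeOption:
        return FakeOption("dummy-token")

    def apiUrl(self) -> FakeOption:
        return FakeOption("https://example.test")


ctx = FakeContext()

# getattr with a literal name and plain dot access resolve the same attribute.
token_via_getattr = getattr(ctx, "apiToken")().get()
token_via_dot = ctx.apiToken().get()

host_via_getattr = getattr(ctx, "apiUrl")().get()
host_via_dot = ctx.apiUrl().get()
```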

@nfx nfx removed the do-not-merge label Aug 17, 2023
@nfx nfx added this pull request to the merge queue Aug 17, 2023
Merged via the queue into main with commit 9aea9a0 Aug 17, 2023
@nfx nfx deleted the fix/runtime-auth branch August 17, 2023 13:32
mgyucht added a commit that referenced this pull request Aug 17, 2023
* Added collection of Databricks Runtime versions used together with Python SDK ([#287](#287)).
* Applied attribute transformer when reading in attributes from the environment ([#293](#293)).
* Made notebook-native auth work with more configurations of the Databricks Runtime ([#285](#285)).
* Added retry in `w.clusters.ensure_cluster_is_running(id)` when a cluster is simultaneously started by two different processes ([#283](#283)).
* Set necessary headers when authenticating via Azure CLI ([#290](#290)).
* Updated classifier to `Development Status :: 4 - Beta` ([#291](#291)).
* Introduced Artifact Allowlist, Securable Tags, and Subentity Tags services.
* Introduced DeleteRuns and RestoreRuns methods in the Experiments API.
* Introduced the GetSecret method in the Secrets API.
* Renamed Auto Maintenance to Predictive Optimization.

New Services:

 * Added [w.artifact_allowlists](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/artifact_allowlists.html) workspace-level service.
 * Added [w.securable_tags](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/securable_tags.html) workspace-level service.
 * Added [w.subentity_tags](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/subentity_tags.html) workspace-level service.
 * Added `databricks.sdk.service.catalog.ArtifactAllowlistInfo` dataclass.
 * Added `databricks.sdk.service.catalog.ArtifactMatcher` dataclass.
 * Added `databricks.sdk.service.catalog.ArtifactType` dataclass.
 * Added `databricks.sdk.service.catalog.GetArtifactAllowlistRequest` dataclass.
 * Added `databricks.sdk.service.catalog.ListSecurableTagsRequest` dataclass.
 * Added `databricks.sdk.service.catalog.ListSecurableType` dataclass.
 * Added `databricks.sdk.service.catalog.ListSubentityTagsRequest` dataclass.
 * Added `databricks.sdk.service.catalog.MatchType` dataclass.
 * Added `databricks.sdk.service.catalog.SetArtifactAllowlist` dataclass.
 * Added `databricks.sdk.service.catalog.TagChanges` dataclass.
 * Added `databricks.sdk.service.catalog.TagKeyValuePair` dataclass.
 * Added `databricks.sdk.service.catalog.TagSecurable` dataclass.
 * Added `databricks.sdk.service.catalog.TagSecurableAssignment` dataclass.
 * Added `databricks.sdk.service.catalog.TagSecurableAssignmentsList` dataclass.
 * Added `databricks.sdk.service.catalog.TagSubentity` dataclass.
 * Added `databricks.sdk.service.catalog.TagSubentityAssignmentsList` dataclass.
 * Added `databricks.sdk.service.catalog.TagsSubentityAssignment` dataclass.
 * Added `databricks.sdk.service.catalog.UpdateSecurableType` dataclass.
 * Added `databricks.sdk.service.catalog.UpdateTags` dataclass.

New APIs:

 * Added `delete_runs()` method for [w.experiments](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/experiments.html) workspace-level service.
 * Added `restore_runs()` method for [w.experiments](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/experiments.html) workspace-level service.
 * Added `databricks.sdk.service.ml.DeleteRuns` dataclass.
 * Added `databricks.sdk.service.ml.DeleteRunsResponse` dataclass.
 * Added `databricks.sdk.service.ml.RestoreRuns` dataclass.
 * Added `databricks.sdk.service.ml.RestoreRunsResponse` dataclass.
 * Added `get_secret()` method for [w.secrets](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/secrets.html) workspace-level service.
 * Added `databricks.sdk.service.workspace.GetSecretRequest` dataclass.
 * Added `databricks.sdk.service.workspace.GetSecretResponse` dataclass.

Service Renames:

 * Removed `effective_auto_maintenance_flag` field for `databricks.sdk.service.catalog.CatalogInfo`.
 * Removed `enable_auto_maintenance` field for `databricks.sdk.service.catalog.CatalogInfo`.
 * Added `effective_predictive_optimization_flag` field for `databricks.sdk.service.catalog.CatalogInfo`.
 * Added `enable_predictive_optimization` field for `databricks.sdk.service.catalog.CatalogInfo`.
 * Removed `databricks.sdk.service.catalog.EffectiveAutoMaintenanceFlag` dataclass.
 * Removed `databricks.sdk.service.catalog.EffectiveAutoMaintenanceFlagInheritedFromType` dataclass.
 * Removed `databricks.sdk.service.catalog.EnableAutoMaintenance` dataclass.
 * Removed `effective_auto_maintenance_flag` field for `databricks.sdk.service.catalog.SchemaInfo`.
 * Removed `enable_auto_maintenance` field for `databricks.sdk.service.catalog.SchemaInfo`.
 * Added `effective_predictive_optimization_flag` field for `databricks.sdk.service.catalog.SchemaInfo`.
 * Added `enable_predictive_optimization` field for `databricks.sdk.service.catalog.SchemaInfo`.
 * Removed `effective_auto_maintenance_flag` field for `databricks.sdk.service.catalog.TableInfo`.
 * Removed `enable_auto_maintenance` field for `databricks.sdk.service.catalog.TableInfo`.
 * Added `effective_predictive_optimization_flag` field for `databricks.sdk.service.catalog.TableInfo`.
 * Added `enable_predictive_optimization` field for `databricks.sdk.service.catalog.TableInfo`.
 * Added `databricks.sdk.service.catalog.EffectivePredictiveOptimizationFlag` dataclass.
 * Added `databricks.sdk.service.catalog.EffectivePredictiveOptimizationFlagInheritedFromType` dataclass.
 * Added `databricks.sdk.service.catalog.EnablePredictiveOptimization` dataclass.
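
For callers that keep serialized `CatalogInfo`, `SchemaInfo`, or `TableInfo` payloads as plain dictionaries, the field renames listed above could be applied with a small key-mapping helper. This is only a sketch: `FIELD_RENAMES` and `migrate_fields` are hypothetical names, and the assumption that such payloads are flat dictionaries keyed by field name is mine, not something the release notes state.

```python
from typing import Any, Dict

# Old field name -> new field name, taken from the rename list above.
FIELD_RENAMES: Dict[str, str] = {
    "enable_auto_maintenance": "enable_predictive_optimization",
    "effective_auto_maintenance_flag": "effective_predictive_optimization_flag",
}


def migrate_fields(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload with renamed keys; all other keys pass through."""
    return {FIELD_RENAMES.get(key, key): value for key, value in payload.items()}


old_payload = {"name": "main", "enable_auto_maintenance": "ENABLE"}
new_payload = migrate_fields(old_payload)
```

The helper leaves the input untouched and returns a new dictionary, so it can be applied repeatedly without side effects.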

OpenAPI SHA: beff621d7b3e1d59244e2e34fc53a496f310e130, Date: 2023-08-17
@mgyucht mgyucht mentioned this pull request Aug 17, 2023
github-merge-queue bot pushed a commit that referenced this pull request Aug 17, 2023