Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved notebook-native authentication #152

Merged
merged 3 commits into from
Jun 13, 2023
Merged

Improved notebook-native authentication #152

merged 3 commits into from
Jun 13, 2023

Conversation

nfx
Copy link
Contributor

@nfx nfx commented Jun 7, 2023

Ported from MLflow codebase:
image

Resolves #153.

Copy link

@MrBago MrBago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move all the init_runtime_native_auth and databricks_repl_context logic to a new file, there is no reason any of this needs to be databricks.sdk.runtime. The auth stuff and the runtime stuff are distinct and we explicitly do not want them to interact. All this code should work the same in another file.

databricks/sdk/runtime/__init__.py Outdated Show resolved Hide resolved
databricks/sdk/runtime/__init__.py Outdated Show resolved Hide resolved
try:
from dbruntime.databricks_repl_context import get_context
ctx = get_context()
if ctx is not None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a static check on ctx done on import, but get_context will not return a Context until a command has been run. This probably works because databricks.sdk.rutnime is probably not imported until a user interacts with the sdk, but this is pretty brittle and will break if we internally import this file on startup in the future.

Can we make this check either dynamic or at if we want to check once and then give up, could we be more explcit about when the check is done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in f371865 - we need a hostname during the SDK initialization stage. technically, if we can start every runtime with DATABRICKS_HOST environment variable with the f'https://{ctx.browserHostName}', we can simplify the init_runtime_native_auth return type.

please remember - init_runtime_native_auth returns the workspace hostname and function that returns a dictionary of headers. this is to support future improvements to REPL authentication, where we could leverage token refreshes

@nfx nfx requested a review from MrBago June 8, 2023 09:00
@codecov-commenter
Copy link

codecov-commenter commented Jun 8, 2023

Codecov Report

Patch coverage: 35.71% and project coverage change: +0.01 🎉

Comparison is base (038e48a) 53.23% compared to head (b95af59) 53.25%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #152      +/-   ##
==========================================
+ Coverage   53.23%   53.25%   +0.01%     
==========================================
  Files          29       30       +1     
  Lines       17917    18190     +273     
==========================================
+ Hits         9539     9687     +148     
- Misses       8378     8503     +125     
Impacted Files Coverage Δ
databricks/sdk/core.py 66.43% <35.71%> (-1.29%) ⬇️

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

host, inner = init_runtime_native_auth()
cfg.host = host
return inner
try:
from dbruntime.databricks_repl_context import get_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nfx may I know whether you verify in notebook environment (e.g. with DBR 12.2) that it works? I vaguely remember last time I want to import from dbruntime in core.py it does not work. Ignore me if I'm wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested it on couple of DBRs

cfg.host = f'https://{ctx.browserHostName}'

def inner() -> Dict[str, str]:
return {'Authorization': f'Bearer {ctx.apiToken}'}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we lock in the apiToken? I think we should call get_context() inside inner to swap the context for different users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaochen-db also running get_context() within the inner? thanks for pointing this out! will add

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in b95af59

@nfx nfx requested review from xiaochen-db and dby-tmwctw June 12, 2023 17:53
host, inner = init_runtime_native_auth()
cfg.host = host
return inner
try:
from dbruntime.databricks_repl_context import get_context
ctx = get_context()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to double check when is this called? Inside a command or during REPL initialization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both

@nfx nfx merged commit 8dacd83 into main Jun 13, 2023
@nfx nfx deleted the notebook-native-auth branch June 13, 2023 16:38
nfx added a commit that referenced this pull request Jun 15, 2023
* Regenerate from OpenAPI spec ([#176](#176)).
* Added improved notebook-native authentication ([#152](#152)).
* Added methods to provide extra user agent and upstream user agent to SDK config ([#163](#163)).
* Added more missing `Optional` type hints ([#171](#171)).
* Add more missing optional fields ([#177](#177)).
* Correctly serialize external entities ([#178](#178)).
* Correctly serialize external enum values in paths ([#179](#179)).
* Mark non-required fields as `Optional` ([#170](#170)).
* Synchronize auth permutation tests with Go SDK ([#165](#165)).
@nfx nfx mentioned this pull request Jun 15, 2023
1 task
nfx added a commit that referenced this pull request Jun 15, 2023
* Regenerate from OpenAPI spec
([#176](#176)).
* Added improved notebook-native authentication
([#152](#152)).
* Added methods to provide extra user agent and upstream user agent to
SDK config
([#163](#163)).
* Added more missing `Optional` type hints
([#171](#171)).
* Added more missing optional fields
([#177](#177)).
* Correctly serialize external entities
([#178](#178)).
* Correctly serialize external enum values in paths
([#179](#179)).
* Mark non-required fields as `Optional`
([#170](#170)).
* Synchronize auth permutation tests with Go SDK
([#165](#165)).

## Tests

- [x] relevant integration tests applied

---------

Signed-off-by: Serge Smertin <259697+nfx@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use the sdk from within databricks?
7 participants