Default client on gce instances #191

Closed
amardeep opened this issue Jan 20, 2022 · 6 comments · Fixed by #192

Comments

@amardeep

With GOOGLE_APPLICATION_CREDENTIALS not set on gce instances, cloudpathlib operations such as is_dir fail:

from cloudpathlib import CloudPath
p = CloudPath("gs://<some non-public path>")
p.is_dir()

fails with:

/lib/python3.7/site-packages/google/auth/credentials.py in refresh(self, request)
    171         """Raises :class:`ValueError``, anonymous credentials cannot be
    172         refreshed."""
--> 173         raise ValueError("Anonymous credentials cannot be refreshed.")
    174 
    175     def apply(self, headers, token=None):

ValueError: Anonymous credentials cannot be refreshed

On GCE instances, the storage client doesn't need any further auth setup. For example, without GOOGLE_APPLICATION_CREDENTIALS set, this works:

from google.cloud import storage
storage_client = storage.Client()
# Make an authenticated API request
buckets = list(storage_client.list_buckets())
print(buckets)

and this also works:

from cloudpathlib import CloudPath, GSClient
from google.cloud import storage

# Wrap an explicitly constructed storage client and make it the default
gs_client = GSClient(storage_client=storage.Client())
gs_client.set_as_default_client()

p = CloudPath("gs://<some non-public path>")
p.is_dir()

It would be nice if this extra setup were not needed.
Maybe the constructor could first try to create a non-anonymous storage client and, if that fails, create an anonymous one.

@pjbull
Member

pjbull commented Jan 20, 2022

Thanks @amardeep! Do you know the reliable way to test creating the client? Will the following raise an exception if there are no creds, or do we need to actually try to do something with the client?

from google.cloud import storage
storage_client = storage.Client()

If that raises, then we can add a try except in the else block here and have anonymous be the final fallback:

https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/gs/gsclient.py#L70-L77
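
A minimal sketch of that fallback, not the code that was merged. It assumes `storage.Client()` raises `google.auth.exceptions.DefaultCredentialsError` when no credentials can be found. The helper names `build_with_fallback` and `make_default_gs_storage_client` are hypothetical and not part of cloudpathlib.

```python
from typing import Any, Callable, Tuple, Type


def build_with_fallback(
    primary: Callable[[], Any],
    fallback: Callable[[], Any],
    expected_errors: Tuple[Type[BaseException], ...],
) -> Any:
    """Return primary(), or fallback() if primary raises one of expected_errors."""
    try:
        return primary()
    except expected_errors:
        # Only the listed errors trigger the fallback; anything else propagates.
        return fallback()


def make_default_gs_storage_client() -> Any:
    """Hypothetical helper: default credentials first, anonymous client last."""
    # Imported lazily so build_with_fallback works without the Google libraries.
    from google.auth.exceptions import DefaultCredentialsError
    from google.cloud import storage

    return build_with_fallback(
        primary=storage.Client,
        fallback=storage.Client.create_anonymous_client,
        expected_errors=(DefaultCredentialsError,),
    )
```

Catching only the expected exception keeps unrelated failures, such as a bad project setting, visible instead of silently downgrading to an anonymous client.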

@amardeep
Author

I tried the following code on non-GCE and GCE instances:

from google.cloud import storage
storage_client = storage.Client()

It succeeds on GCE instances, and on non-GCE instances where gcloud auth login has been run.

On non-GCE instances with no gcloud auth login and no Google credentials in the environment, it fails with the following trace:

Traceback (most recent call last):
  File "/root/test_storage_client.py", line 2, in <module>
    storage_client = storage.Client()
  File "/usr/local/lib/python3.9/dist-packages/google/cloud/storage/client.py", line 159, in __init__
    super(Client, self).__init__(
  File "/usr/local/lib/python3.9/dist-packages/google/cloud/client/__init__.py", line 318, in __init__
    _ClientProjectMixin.__init__(self, project=project, credentials=credentials)
  File "/usr/local/lib/python3.9/dist-packages/google/cloud/client/__init__.py", line 266, in __init__
    project = self._determine_default(project)
  File "/usr/local/lib/python3.9/dist-packages/google/cloud/client/__init__.py", line 285, in _determine_default
    return _determine_default_project(project)
  File "/usr/local/lib/python3.9/dist-packages/google/cloud/_helpers/__init__.py", line 154, in _determine_default_project
    _, project = google.auth.default()
  File "/usr/local/lib/python3.9/dist-packages/google/auth/_default.py", line 493, in default
    raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

@pjbull
Member

pjbull commented Jan 21, 2022

@amardeep Thanks! We have a potential fix. Would you mind testing it on your GCE env and non-GCE env?

You can install the branch with the fix by doing:

pip install "git+https://github.com/drivendataorg/cloudpathlib@191-gce-default-client#egg=cloudpathlib[gs]"

@amardeep
Author

Thanks. The change looks fine. I tested it and it works, with one issue that is probably a problem in the storage client library.

On a non-GCE machine with no credentials, it works fine with public buckets. For example:

from cloudpathlib import CloudPath
p = CloudPath("gs://tfds-data/datasets")
p1 = p / "mnist"
print(list(p1.glob("*")))
p2 = CloudPath("gs://<some private bucket>/")
print(list(p2.glob("*")))

The first works and the second fails, as expected.

On a GCE machine, both work as expected.

On a non-GCE machine with GOOGLE_APPLICATION_CREDENTIALS set, both work.

On a non-GCE machine without GOOGLE_APPLICATION_CREDENTIALS but with gcloud auth login, it fails with the following error. I think this is an issue with the storage client library, so the current change should be good.

File ".venv/lib/python3.9/site-packages/google/oauth2/_client.py", line 60, in _handle_error_response
    raise exceptions.RefreshError(error_details, response_data)
google.auth.exceptions.RefreshError: ('invalid_grant: Token has been expired or revoked.', {'error': 'invalid_grant', 'error_description': 'Token has been expired or revoked.'})
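
One way to tell this case apart from "no credentials at all" might be to probe whether the discovered credentials can still be refreshed. The sketch below is an assumption about how such a check could look and is not something cloudpathlib does. `credentials_can_refresh` is a hypothetical name, and the stand-in `RefreshError` class only exists so the snippet runs without google-auth installed.

```python
from typing import Any

try:
    from google.auth.exceptions import RefreshError
except ImportError:
    # Stand-in so the sketch can be read and run without google-auth installed.
    class RefreshError(Exception):
        pass


def credentials_can_refresh(credentials: Any, request: Any) -> bool:
    """Hypothetical probe: True if credentials.refresh(request) succeeds."""
    try:
        credentials.refresh(request)
    except RefreshError:
        # Expired or revoked tokens surface here, as in the trace above.
        return False
    return True
```

With the real libraries, `credentials` would come from `google.auth.default()` and `request` would be a `google.auth.transport.requests.Request()` instance. Note that the probe makes a network call.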

@pjbull
Member

pjbull commented Jan 23, 2022

Thanks for testing @amardeep. We'll see if we get any reported issues on that last scenario—maybe there's another branch to the logic here, but it seems like this fix should make more scenarios work smoothly for folks.

@pjbull pjbull removed the review label Jan 23, 2022
@amardeep
Author

Thanks a lot for the quick response :)
