Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PROPOSAL: Storage API consistency fix #632

Closed
3 of 8 tasks
dhermes opened this issue Feb 13, 2015 · 11 comments
Closed
3 of 8 tasks

PROPOSAL: Storage API consistency fix #632

dhermes opened this issue Feb 13, 2015 · 11 comments
Assignees
Labels
api: storage Issues related to the Cloud Storage API.

Comments

@dhermes
Copy link
Contributor

dhermes commented Feb 13, 2015

Guiding principles:

  • Getters and setters should never make HTTP requests. Lazy loading is OK,
    but only when it involves instance creation / other local (non-network bound) behavior. For example, in Bucket.acl this already happens:
@property
def acl(self):
    """Create our ACL on demand."""
    if self._acl is None:
        self._acl = BucketACL(self)
    return self._acl
  • More generally HTTP requests should be limited to explicit method calls. This also rules out constructors loading data.
  • Blob, Bucket, and *ACL (the only nouns) instances should have load(), exists(), create(), delete(), and update() methods. This design gives rise to code like
blob = Blob('/remote/path.txt', bucket=bucket, properties=properties)
try:
    blob.load()  # Just metadata
except NotFound:
    blob.upload_from_file(filename)  # Sends metadata from properties

(this maybe screams for get_or_create(), we'll feel it out as we develop). It's unclear if it's worth making a distinction between storage.NOUN.update <--> PUT and storage.NOUN.patch <--> PATCH. (As of right now, we don't implement PUT / update anywhere.)

  • exists() should use fields in the requests to minimize the payload.
  • A Connection should not be required to be bound to any object (one of the nouns Bucket, ACL, or Blob) but should be an optional argument to methods which actually talk to the API.
  • We should strongly consider adding a last_updated field to classes to indicate the last time the values were updated (from the server).
  • For list methods: List all buckets top-level, e.g.
storage.get_all_buckets(connection=optional_connection)

and then bucket.get_all_objects(). It's unclear how the other 3 nouns (objectAccessControls, bucketAccessControls and defaultObjectAccessControls) will handle this. Right now they are handled via ObjectACL.reload() and BucketACL.reload() (a superclass of DefaultObjectACL).

  • Implicit behavior (default project, default bucket and/or default connection) should be used wherever possible (and documented)

@tseaver Please weigh in. This was inspired by our discussion at the end of #604.

/cc @thobrla I'd like to close #545 and focus on this (possibly broken down into sub-bugs). Does that sound OK?

@thobrla
Copy link

thobrla commented Feb 13, 2015

Closing #545 and focusing on this+sub-bugs sounds good to me; this captures the concerns I had there.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 13, 2015

Thanks!

@dhermes
Copy link
Contributor Author

dhermes commented Feb 14, 2015

Aside: I made a list of all the 34 API methods and the corresponding code paths.

@dhermes dhermes changed the title PROPOSAL: Storage API consistency fix (545) PROPOSAL: Storage API consistency fix Feb 15, 2015
@dhermes
Copy link
Contributor Author

dhermes commented Feb 15, 2015

@tseaver I'd like to get moving on this soon. WDYT of the proposals?

@tseaver
Copy link
Contributor

tseaver commented Feb 15, 2015

@dhermes I'm in "quick hit" mode today, but can review in more depth tomorrow.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 15, 2015

I'm in Portland mostly AFK, so next few days is fine.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 24, 2015

@tseaver Can we move forward on this?

@tseaver
Copy link
Contributor

tseaver commented Feb 24, 2015

I'm not sure debate here is the right thing: this is a large set of changes, and I'm not sure what the goals are: we should be talking about them, first, before we try to plan out an implementation.

@dhermes
Copy link
Contributor Author

dhermes commented Feb 24, 2015

The goal is really just making the API more usable.

The current behavior of "sync" whenever a property is added outside a batch shouldn't be the default behavior (see #545). Most of the changes above relate to making network interaction occur more transparently.

@jgrahn
Copy link

jgrahn commented Jun 21, 2016

Hi. Is there any progress on the part that says blob.upload_from_file(filename) # Sends metadata from properties in this issue (or is there a separate issue for it I haven't found)?

Having to use patch() after upload_...() unfortunately has several more or less serious drawbacks:

  1. It requires the client to have credentials allowing PATCH requests, preventing immutable append-only semantics.
  2. It is not atomic, so if the PATCH operation fails, or the client fails between the calls, the data in the storage service might be left in an inconsistent state.
  3. It lacks consistency. There will be a window when the blob data has been uploaded but when metadata are still incorrect or missing, leading to potential race conditions.
  4. The metadata-generation will never be 1.

These are currently blocking me from using gcloud-python.

User @pdknsk seems to have provided a patch in #536. I think it would make sense implementing that even before/without having the "load(), exists(), create(), delete(), and update()" interface described in this issue.

I'd be happy to provide a pull request if that helps.

@pdknsk
Copy link

pdknsk commented Jun 21, 2016 via email

vchudnov-g pushed a commit that referenced this issue Sep 20, 2023
* docs: Fix formatting of request arg in docstring

chore: Update gapic-generator-python to v1.9.1
PiperOrigin-RevId: 518604533

Source-Link: googleapis/googleapis@8a085ae

Source-Link: googleapis/googleapis-gen@b2ab4b0
Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiYjJhYjRiMGEwYWUyOTA3ZTgxMmMyMDkxOThhNzRlMDg5OGFmY2IwNCJ9

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API.
Projects
None yet
Development

No branches or pull requests

6 participants