Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/range bytes #2890

Closed
wants to merge 6 commits into from
Closed

Conversation

quasiben
Copy link

PR attempts to resolve #2888. Users should be able to easily download a range of bytes when working with storage blobs. This PR tries to minimally inject a new kwarg: range_bytes in download_as_string and related downloading functions:

# download the first 100 bytes
blob.download_as_string(range_bytes=(0,100))

# download the second 100 bytes
blob.download_as_string(range_bytes=(99,200))

cc @daspecster

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Dec 20, 2016
@daspecster
Copy link
Contributor

@quasiben awesome! Thanks!

I'll start a review on this right now.

Copy link
Contributor

@daspecster daspecster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good!

@@ -370,18 +370,30 @@ def initialize_download(self, http_request, http):

:type http: :class:`httplib2.Http` (or workalike)
:param http: Http instance for this request.

:type range_bytes: tuple
:param range_bytes: Optional. Range of bytes to download.

This comment was marked as spam.

self._set_range_header(http_request, 0, end_byte)
if range_bytes:
start_byte, end_byte = range_bytes
end_byte = end_byte - 1

This comment was marked as spam.

This comment was marked as spam.

response = make_api_request(
self.bytes_http or http, http_request)
if response.status_code not in self._ACCEPTABLE_STATUSES:
raise HttpError.from_response(response)
self._initial_response = response
self._set_total(response.info)
# when using `range_bytes` we need to reset total_size

This comment was marked as spam.

@@ -319,6 +319,9 @@ def download_to_file(self, file_obj, client=None):
:param client: Optional. The client to use. If not passed, falls back
to the ``client`` stored on the blob's bucket.

:type range_bytes: tuple
:param range_bytes: Optional. Range of bytes to download.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

"""Download the contents of this blob as a string.

:type client: :class:`~google.cloud.storage.client.Client` or
``NoneType``
:param client: Optional. The client to use. If not passed, falls back
to the ``client`` stored on the blob's bucket.

:type range_bytes: tuple
:param range_bytes: Optional. Range of bytes to download.

This comment was marked as spam.

@daspecster
Copy link
Contributor

Aside from my little nitpicks, if you want to know how to run the tests locally, I recommend running a few things and making sure they pass before pushing up.

tox -e py27 -- core storage
tox -e py35 -- core storage
tox -e cover -- core storage
tox -e lint

A guide to setting up tox and running these can be found in our CONTRIBUTING.rst.

@quasiben
Copy link
Author

Thanks for the review @daspecster

@quasiben
Copy link
Author

I'll have to get another setup to properly use tox. Currently, my main Python environments are conda based

@quasiben quasiben force-pushed the feature/range_bytes branch from 09002c4 to d2b790b Compare December 20, 2016 22:48
Copy link
Contributor

@dhermes dhermes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I get deeper in this review, I want to point something out.

As part of "divorcing" httplib2 we (@jonparrott and I) plan on greatly re-factoring google.cloud.streaming.

:param range_bytes: (Optional). Range of bytes to download.

For more info on Range Bytes:
https://cloud.google.com/storage/docs/xml-api/reference-headers#range

This comment was marked as spam.

"""
self._ensure_uninitialized()
url = http_request.url
if self.auto_transfer:
end_byte = self._compute_end_byte(0)
self._set_range_header(http_request, 0, end_byte)
if range_bytes:

This comment was marked as spam.

if range_bytes:
start_byte, end_byte = range_bytes
start_byte = int(start_byte)
end_byte = int(end_byte)

This comment was marked as spam.

# when using `range_bytes` we need to reset total_size,
# otherwise we get the total of the file and the entire contents
# are downloaded
if range_bytes: self._total_size = abs(end_byte - start_byte)

This comment was marked as spam.

This comment was marked as spam.

@quasiben
Copy link
Author

@dhermes happy to wait on this work if the "divorce" is imminent

@quasiben
Copy link
Author

I signed it!

@googlebot
Copy link

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Dec 21, 2016
@quasiben
Copy link
Author

anything else that needs doing here?

@daspecster
Copy link
Contributor

@quasiben your changes that I requested look good!

I think @dhermes is going to look into this more based on his comment so let's see what suggestions he has before merging.

@quasiben
Copy link
Author

Sounds good

@daspecster daspecster added api: core api: storage Issues related to the Cloud Storage API. labels Dec 28, 2016
@daspecster
Copy link
Contributor

@dhermes did you have any other feedback for this?

@daspecster
Copy link
Contributor

@dhermes just a reminder about this.

@daspecster
Copy link
Contributor

image

@daspecster daspecster requested a review from tseaver January 30, 2017 21:20
@daspecster
Copy link
Contributor

Thank you so much for this @quasiben! However after talking to @dhermes, it sounds like a better path may be to work towards #1998 and remove httplib2.

I'm going to close this for now and try and work towards #1998.

@daspecster daspecster closed this Jan 30, 2017
@quasiben
Copy link
Author

Ok, any ideas on a timeline for #1998 ?

@daspecster
Copy link
Contributor

@dhermes has a plan in mind I think.

@dhermes is there anyway that I can help with this after this week(due to the other stuff we have to get done)?

@danoscarmike danoscarmike mentioned this pull request Aug 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: core api: storage Issues related to the Cloud Storage API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How To Set Range
4 participants