Skip to content
This repository has been archived by the owner on Mar 20, 2018. It is now read-only.

Change page-streaming behavior #94

Merged
merged 3 commits into from
Apr 29, 2016
Merged

Change page-streaming behavior #94

merged 3 commits into from
Apr 29, 2016

Conversation

geigerj
Copy link
Contributor

@geigerj geigerj commented Apr 26, 2016

Fixes #86

Now return an iterable over pages of the response resource field,
rather than the return type, for page stream-able calls where page
streaming is disabled.

Note that this removes the ability to get a standard gRPC/proto
response from any call that is capable of page streaming.

This is intended to demonstrate one of the proposals from #86

Now return an iterable over pages of the response resource field,
rather than the return type, for page stream-able calls where page
streaming is disabled.

Note that this removes the ability to get a standard gRPC/proto
response from any call that is capable of page streaming.
@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

My primary concern with this approach is that it is no longer possible to access any fields in the response type beside the repeated resources field. What if an API wants to send a response type that looks like, e.g.

message ListFooResponse {
  repeated Foo foos = 1;
  int next_page_token = 2;
  // Reports Foos that could not be read, and the errors that occurred when trying to read them
  repeated Error errors = 3;
}

or

message ListFooResponse {
  repeated Foo foos = 1;
  int next_page_token = 2;
  // The number of foos in this page
  int foo_count = 3;
}

?

@tbetbetbe
Copy link
Contributor

My primary concern with this approach is that it is no longer possible to access any fields in the response type beside the repeated resources field.

GAX provides support for specific API patterns. There is a doc (currently internal) that describes the API patterns that GAX aims to support. IUC, the cases you've described contravene the rules it lays out for page-streaming calls.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

Case 2 actually comes from the document you referenced (foo_count actually should be the total count of foos, across all pages). There are examples of the first case internally as well, although they may be nonstandard, as you suggested.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

It also might be worth taking a look at googleapis/google-cloud-python#895, which discusses a similar feature.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

See also the "list" examples in googleapis/google-cloud-python#1722

@jmuk
Copy link
Contributor

jmuk commented Apr 26, 2016

How about yielding a custom class instead of the result of getattr()?
Then holding the response message in the class and implement iter() for iterating over the resources.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 26, 2016

That seems possible and is essentially what gcloud team do with gcloud.iterator.

@tbetbetbe
Copy link
Contributor

How about yielding a custom class instead of the result of getattr()?
Then holding the response message in the class and implement iter() for iterating over the resources.

That seems possible and is essentially what gcloud team do with gcloud.iterator.

Can you update the PR to try this ?

@tseaver
Copy link

tseaver commented Apr 27, 2016

If I disable is_page_streaming, and therefore get the actual response back with its next_page_token, how does gax expect me to pass the token back in for a subsequent response?

@geigerj
Copy link
Contributor Author

geigerj commented Apr 28, 2016

You're right -- you can't currently pass a page token through the wrapper list_resource methods in the generated code.

I would suggest we replace the is_page_streaming attribute in CallOptions by a page_token attribute, which, if set, disables page streaming.

@tbetbetbe
Copy link
Contributor

LGTM

Can you update the PR to reflect this.

On 28 April 2016 at 09:12, geigerj notifications@github.com wrote:

You're right -- you can't currently pass a page token through the wrapper
list_resource methods in the generated code.

I would suggest we replace the is_page_streaming attribute in CallOptions
https://github.com/googleapis/gax-python/blob/master/google/gax/__init__.py#L116
by a page_token attribute, which, if set, disables page streaming.


You are receiving this because you were assigned.
Reply to this email directly or view it on GitHub
#94 (comment)

This email may be confidential or privileged. If you received this
communication by mistake, please don't forward it to anyone else (it may
contain confidential or privileged information), please erase all copies of
it, including all attachments, and please let the sender know it went to
the wrong person. Thanks.

@tseaver
Copy link

tseaver commented Apr 28, 2016

@geigerj How would I make the initial non-streaming request (when I do not yet have a token)?

@geigerj
Copy link
Contributor Author

geigerj commented Apr 28, 2016

One option would be to set page_token to 0, which corresponds to the value in the request proto.

@tbetbetbe
Copy link
Contributor

One option would be to set page_token to 0, which corresponds to the value in the request proto.

as a refinement, make an appropriately named public constant in the GAX library with the value 0, e,g INITIAL_PAGE and use that.

@tseaver
Copy link

tseaver commented Apr 29, 2016

Hmmm, I was under the impression that page tokens are opaque strings, rather than integers (my guess is that they capture some kind of "generational" state for the request). Is 0 a legitimate value?

@geigerj
Copy link
Contributor Author

geigerj commented Apr 29, 2016

In that case, an empty string, rather than 0. For the first page, you'd want to pass in the proto3 default value for that field.

Now, if page streaming is set to per-page (rather than per-resource), it
is possible to specify a page token to the request. It is also possible
to retrieve the full response message, as well as the next page token,
from the per-page iterator.
@geigerj
Copy link
Contributor Author

geigerj commented Apr 29, 2016

Updated. In the new version,

  • To page stream over resources, the behavior is the same as before this PR.
  • To page stream over pages starting from the first page, you set the page_token of CallOptions to INITIAL_PAGE. This returns a PageIterator object that can iterate over pages of resources, but also provides access to the page_token and the full response message.
  • To page stream over pages starting from an arbitrary page, you obtain a page_token from the PageIterator object, and pass that into set CallOptions.

@jmuk
Copy link
Contributor

jmuk commented Apr 29, 2016

Sorry I haven't caught up the discussion here, but I don't think there is rules for the type of the page tokens. I believe they are almost always string, but theoretically some API can publish methods with numeric page tokens or messages.

Why not adopting the same approach as OPTION_INHERIT? i.e.

  • define INITIAL_PAGE with object()
  • check if the current page token equals to INITIAL_PAGE exactly, and do not set the page in that case.

Previously, assumed page_token was a string. Now it can have any type.
@geigerj
Copy link
Contributor Author

geigerj commented Apr 29, 2016

Updated.

Good point; if we don't need to, it's better not to tie the page token to a particular type.

@codecov-io
Copy link

Current coverage is 97.02%

Merging #94 into master will increase coverage by +0.17%

@@             master        #94   diff @@
==========================================
  Files             8          8          
  Lines           539        571    +32   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            522        554    +32   
  Misses           17         17          
  Partials          0          0          

Powered by Codecov. Last updated by cfed384...7123ce6

@jmuk
Copy link
Contributor

jmuk commented Apr 29, 2016

looks good to me

@tbetbetbe
Copy link
Contributor

LGTM, but bump the version please. I think this probably corresponds to a new minor version.

Can you also

  • add a tag for the version as well
  • and once you've merged the PR, push the tag to upstream.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 29, 2016

The version was updated in the first commit already. Will push the tags following the merge.

@geigerj
Copy link
Contributor Author

geigerj commented Apr 29, 2016

Fixes #86

@geigerj geigerj merged commit 6bab6a3 into googleapis:master Apr 29, 2016
@tseaver
Copy link

tseaver commented May 3, 2016

Thanks for implementing this feature. When do you expect to make a PyPI release which includes it?

@geigerj
Copy link
Contributor Author

geigerj commented May 4, 2016

@tbetbetbe

I can push this to PyPI now. Tim, is there any reason I should hold off on the push, for example, to coordinate with a codegen refresh? (I may not have time to refresh codegen until a bit later, but I believe the vkit packages are pegged to the old version of GAX, so I don't think pushing this to PyPI should affect them.)

@tseaver
Copy link

tseaver commented May 16, 2016

I know this is merged already, and am looking at how gcloud.pubsub will use it. I don't see a way to pass through a custom page_size (as found in pubsub.topics.ListTopicsRequest et aliae.

@geigerj
Copy link
Contributor Author

geigerj commented May 16, 2016

Ah, yes, you're right. I've just filed an issue to fix this in the code generator: googleapis/gapic-generator#110. With this new set up, we should allow page_size to be passed as an optional parameter to page streaming calls that support it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants