Datastore query limits are silently truncated by the v1beta3 API #1763
@Bogdanp This is a "feature", not a bug. The API doesn't promise to return your fetch limit, it just promises not to exceed it. It typically times out on the backend after a pre-set amount of time. Is the … @pcostell can say more, though I am pre-emptively closing this since there isn't anything we can do in the library.
The library does exceed it in the second example I gave though: requesting 500 entities will end up returning more than the limit if there exist more entities than that limit.
Oh wow, my bad! Good thing I forgot to click the close button. Reproducing now on my end.
No worries! Regarding the "feature", is it possible to get that bumped up somehow or is that set in stone? The old API used to allow much higher values for limits.
The issue is with the way iter works, btw: it naively calls …
OK, this is totally a "bug" in our implementation, essentially identical to #1467. It occurs because … As for 300 vs. 500, that's completely out of …
Jinx!
The Cloud Datastore API has to reserve the right not to return all the results -- we need to protect both our servers and other users on them. This is why we expose our API with support for batching -- but the client libraries need to take advantage of this in order to expose a nicer API to our users. I think …
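(A minimal sketch of what consuming the batched API looks like from the caller's side, assuming a google-cloud-datastore release where fetch() accepts start_cursor and the returned iterator exposes pages and next_page_token; the kind name MyKind is a placeholder.)

from google.cloud import datastore

client = datastore.Client()
query = client.query(kind='MyKind')  # placeholder kind

all_entities = []
cursor = None
while True:
    # Request one batch at a time; the backend may return fewer results
    # than asked for, but never more.
    iterator = query.fetch(limit=300, start_cursor=cursor)
    page = list(next(iterator.pages))
    if not page:
        break
    all_entities.extend(page)
    cursor = iterator.next_page_token
    if not cursor:
        break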
Thanks @pcostell.
@dhermes can we get this bug assigned to someone? The Cloud Datastore batch size shouldn't be something client library users have to worry about.
In the process, also keeping track of the number of skipped results (so that we can update the offset). Fixes googleapis#1763.
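(A rough sketch of that idea, not the actual patch: run_batch below is a hypothetical stand-in for a single RunQuery RPC that reports how many results the backend skipped, so the remaining offset and limit can be carried from batch to batch.)

def fetch_all(run_batch, limit=None, offset=0):
    # run_batch(limit, offset, start_cursor) -> (entities, skipped, cursor, more)
    results = []
    cursor = None
    while limit is None or len(results) < limit:
        remaining = None if limit is None else limit - len(results)
        entities, skipped, cursor, more = run_batch(
            limit=remaining, offset=offset, start_cursor=cursor)
        # Track skipped results so the offset is only applied once across
        # batches instead of being re-sent in full on every request.
        offset = max(0, offset - skipped)
        results.extend(entities)
        if not more or (not entities and not skipped):
            break
    return results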
Hey.
@Rafff How do you mean? The API has limits based on payload size and there isn't anything we can do about it in the client. However, we do provide an iterator so that you don't have to worry about it.
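(For example, a minimal sketch assuming a google-cloud-datastore version where the iterator pages through batches transparently; MyKind is a placeholder kind.)

from google.cloud import datastore

client = datastore.Client()
query = client.query(kind='MyKind')  # placeholder kind

# Iterating lets the client issue follow-up batches behind the scenes,
# so the per-request cap never has to be handled by the caller.
total = sum(1 for _ in query.fetch())
print(total)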
Hi @dhermes, Hint: all my entities match the query.
@ibrahim-abuelalaa Would you mind sharing some code so we can try to reproduce?
# kind, projectKey, limit, and offset are defined elsewhere.
from google.cloud import datastore

client = datastore.Client()
query = client.query(kind=kind)
query.add_filter('projectKey', '=', projectKey)
len(list(query.fetch(limit=limit, offset=offset)))
So you have something like:

>>> query_iter1 = query.fetch()
>>> len(list(query_iter1))
8001
>>>
>>> query_iter2 = query.fetch(offset=351, limit=2000)
>>> len(list(query_iter2))
1899
>>>
>>> query_iter3 = query.fetch(offset=1000, limit=2000)
>>> len(list(query_iter3))
0
@ibrahim-abuelalaa Is the distinction relevant? (I.e. is "almost" good enough?) I just want to be able to reproduce this issue.
@dhermes yes, it is exactly like that, sorry for the confusion.
Hello. Is there any way to work around the 300-entity limit? Is there any documentation about this 300 limit? I searched the library source code and found nothing related to it. Is it from the server side? What is the best practice for iterating over all the entities? Should we set the limit to …
This bug manifests itself in a couple of "fun" ways:

- The first snippet will yield 300, assuming there are 300 or more entities of that kind.
- The second snippet, assuming there are 500 entities of that kind, will yield 500. If there are n entities where n > 500, it will yield n.

This all seems to be caused by the v1beta3 API truncating the limits to 300: len(entities) will be 300 and not 500 even if there are 500 or more entities of that kind.

We're currently using the following snippet to work around this issue:
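(The snippet itself is not shown above; the following is a sketch of one possible workaround along those lines rather than the original code. fetch_up_to is a hypothetical helper, and it assumes fetch() accepts start_cursor and exposes pages and next_page_token.)

from google.cloud import datastore

def fetch_up_to(client, kind, limit, offset=0):
    # Page manually with cursors until `limit` entities have been collected
    # or the backend runs out of results, instead of trusting a single fetch.
    query = client.query(kind=kind)
    results = []
    cursor = None
    while len(results) < limit:
        iterator = query.fetch(
            limit=limit - len(results),
            offset=offset if cursor is None else 0,  # apply the offset once
            start_cursor=cursor,
        )
        page = list(next(iterator.pages))
        if not page:
            break
        results.extend(page)
        cursor = iterator.next_page_token
        if not cursor:
            break
    return results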