Support batched queries (fix #1812) #3490
Conversation
This LGTM, but we should probably update the docs as well. At the very least, we should update the API documentation for /v1/graphql, but I'm not sure if a mention elsewhere makes sense. @marionschleifer, do you have any thoughts?
I have a question. I think the convention around executing multiple queries is to gather all errors during execution (if there are any) and then return a response with both the data and the errors. The spec doesn't talk about this behaviour explicitly AFAICT, but the sub-sections in the Response section have examples that look like that, particularly example 185. The spec also mentions that if there are validation errors, the server can reject the entire query. I think most libraries follow the convention of executing all queries even if a previous one had execution errors, and returning data and errors together. We could probably verify this with something like apollo-server (I think they support batch queries). With the current implementation, it looks like if there's an execution error we get back only the error. So the question is: shouldn't we follow the convention?
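For concreteness, the convention being referenced is a response that carries errors alongside partial data, roughly like the spec's example 185 (reproduced loosely here, so details may differ from the published spec text):

```json
{
  "errors": [
    {
      "message": "Name for character with ID 1002 could not be fetched.",
      "locations": [{ "line": 6, "column": 7 }],
      "path": ["hero", "heroFriends", 1, "name"]
    }
  ],
  "data": {
    "hero": {
      "name": "R2-D2",
      "heroFriends": [
        { "id": "1000", "name": "Luke Skywalker" },
        { "id": "1002", "name": null },
        { "id": "1003", "name": "Leia Organa" }
      ]
    }
  }
}
```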
Since the spec doesn't mention batching, I think it'd be reasonable to treat batched requests as if each individual request were run in order. If the first request fails, I think it's fair not to run the second. Especially in the case where the first request is a mutation, it probably doesn't make sense to carry on to the second request: if the second is a query, it might return incorrect data, and if it is a mutation, its effects might depend on the effects of the first being successful.

Now, it might be more reasonable to run batches of queries in parallel, and return errors in parallel too, allowing one but not all to succeed, but I think there's a case to be made that it's better to keep the semantics uniform across queries and mutations. Also, in the case of remote schemas, I'm not sure to what extent we can expect queries to be entirely non-side-effecting.

The example you linked suggests parallel semantics for fields within a single query, but I think that's fine, because a single query can't include mutations inside it, whereas a batch of requests can mix queries and mutations.
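To make the sequential argument concrete, here is a hedged sketch of a batched payload (the array shape and the table/field names are assumed for illustration, not taken from this PR) where stopping after a failed first entry avoids running a query that may depend on the mutation's effects:

```json
[
  { "query": "mutation { insert_author(objects: [{ name: \"Ada\" }]) { affected_rows } }" },
  { "query": "query { author { id name } }" }
]
```

Under the sequential semantics described above, if the mutation fails, the follow-up query is never executed.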
@@ -28,9 +28,60 @@ The following types of requests can be made using the GraphQL API:

 - :doc:`Query / Subscription <query>`
 - :doc:`Mutation <mutation>`

+Batching requests
I'm a bit confused by the title "Batching requests" and then, further down, the sentence "...we can send two queries in one request". Is it multiple queries in a request? Or multiple requests in a request? 😄
Well the example is two queries, but it's not restricted to just queries because you can mix queries and mutations. So, I need a word that covers both of those - perhaps "request" isn't ideal since there is also "the request", but just saying "query" would be incomplete.
I switched it to "operations" since that's the word the GraphQL spec uses.
Ooo I very much disagree with this and this is going against the conventions established by other graphql servers. Apollo allows automatic batching of queries made within a certain timeframe--e.g. batch all queries made within 10 milliseconds of each other. If Hasura implements batching as described above, this will mean that unrelated queries in an app may fail depending on when they are executed--very unexpected and very undesirable!
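For reference, this is roughly how a client opts into Apollo's transport-level batching with apollo-link-batch-http (a minimal sketch; the endpoint URI is an assumed local Hasura address, and the interval/size values shown are the library's documented defaults):

```js
import { ApolloClient } from 'apollo-client';
import { InMemoryCache } from 'apollo-cache-inmemory';
import { BatchHttpLink } from 'apollo-link-batch-http';

const client = new ApolloClient({
  cache: new InMemoryCache(),
  link: new BatchHttpLink({
    uri: 'http://localhost:8080/v1/graphql', // assumed local graphql-engine endpoint
    batchInterval: 10, // operations issued within 10ms go out as one HTTP request
    batchMax: 10,      // at most 10 operations per batch
  }),
});
```

With this link in place, unrelated queries fired from different parts of an app can land in the same batch purely by timing, which is why per-batch failure semantics would be so surprising.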
If someone wants queries/mutations to be executed as part of the same transaction (or if they want the order to matter), they should send them as part of a single operation (easy to do with aliases; see the sketch below). Batched queries (and mutations) should always be handled in parallel--I know this is the way graphql-ruby handles them, and I also just verified that this is the way apollo-server handles them.
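For reference, combining multiple mutations into one operation with aliases looks roughly like this (table and field names are illustrative, not from this PR); the aliased root fields execute in order within the single operation:

```graphql
mutation {
  first: insert_article(objects: [{ title: "One" }]) {
    affected_rows
  }
  second: insert_article(objects: [{ title: "Two" }]) {
    affected_rows
  }
}
```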
@thefliik I wasn’t really familiar with the way Apollo’s batching works, but I just read up on it a little bit. It seems like the purpose of batching is really just to reduce the number of concurrent in-flight HTTP requests… is that accurate? If it is, then I admit I’m a little skeptical of its value. It seems like there are already two better solutions to that problem:
In any case, I agree that this PR is inconsistent with the way the Apollo client evidently expects batching to work, so I think we should probably back this change out or change the way it works (as right now it implements a protocol no client is likely to implement). But the above points make me wonder if it's worth supporting at all. Is there any reason you think that HTTP/2 support wouldn't be enough to subsume this mechanism?
I'm also unclear on the benefits of batching vs. a transport-layer solution, but after some thought I've convinced myself that if we support batching at all, it should have the semantics you describe. While it's possible to send some nonsensical requests, it's no worse than what you could do with non-serialized requests without batching (e.g. multiple threads of execution in JS sending requests independently), and I think it would be fine for the semantics to emulate that behavior. That is, we pretend we received N independent requests in order.

I was previously concerned because for some reason I thought the different operations could come from the same thread of execution, and therefore could depend on each other. However, the only way I think you could get into this situation is something like this:

```js
// Two independent mutations: neither callback can see the other's result.
client.mutate({
  ...
}).then(...);

client.mutate({
  ...
}).then(...);
```

or

```js
// A dependent mutation: the second is only issued after the first resolves.
client.mutate({
  ...
}).then(x => client.mutate({
  ...
}).then(...));
```

In the first case, there is no way for them to depend on each other's results, and in the second, the two requests would never occur in the same batch due to data dependencies.
@lexi-lambda @paf31 I believe you are correct. For reference, you can read this Apollo blog post about batching, which specifically cites HTTP/2 as an alternative to batching: https://blog.apollographql.com/batching-client-graphql-queries-a685f5bcd41b

I'm speculating, but I think batching exists for folks who want to use plain old HTTP (for whatever reason). It's also important to know that batching was introduced/invented by Apollo back in 2016, well before the GraphQL spec included subscriptions and, I think, well before most folks were using GraphQL over websockets. I just did a cursory check, and it looks like a draft of the HTTP/2 spec was introduced in 2015, so it's also unlikely anyone was using that when batching was introduced. (I think I read somewhere that it wasn't until Node 10--released in 2018--that HTTP/2 support was included, so I'll also point out that Google Firebase Functions don't even officially support Node 10 yet; support is currently in beta.)

Anyway, I'm not familiar with the HTTP/2 spec, but from my perspective there still might be cause to support ApolloClient batching if there were reasons why Hasura users couldn't use HTTP/2 or websockets. In general, I could imagine websockets might be problematic for someone who needs a stateless connection. For example, if the Hasura server was called like a serverless function (using something like Google Cloud Run), then websockets wouldn't be an option (not sure about HTTP/2).
Description
This adds the GQLBatchedReqs type, which supports multiple batched queries like the one described in the original linked example, as well as single queries, just like before.

It's unclear what to do with forwarded headers in the case where we have multiple queries, so to keep things simple for now, batched queries will not forward any headers. We can revisit this later if necessary.
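Concretely, this suggests /v1/graphql accepts either of the following shapes (payload contents are illustrative, and the exact wire format is assumed from the discussion in this thread: a JSON array of operations in, a matching array of responses out). The single-object form is the existing API:

```json
{ "query": "query { author { id name } }" }
```

and the array form is the new batched one:

```json
[
  { "query": "query { author { id name } }" },
  { "query": "query { article { id title } }" }
]
```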
Affected components

Related Issues

#1812

Catalog upgrade

Does this PR change Hasura Catalog version?

Metadata

Does this PR add a new Metadata feature?

- Does run_sql auto manage the new metadata through schema diffing?
- Does run_sql auto manage the definitions of metadata on renaming?
- Does export_metadata/replace_metadata support the new metadata added?

GraphQL

Breaking changes

No Breaking changes

There are breaking changes:

- Metadata API
  - Existing query types: args payload which is not backward compatible
  - JSON schema
- GraphQL API
  - Schema Generation: NamedType
  - Schema Resolve: null value for any input fields
- Logging
  - JSON schema has changed
  - type names have changed

Steps to test and verify
I've tested this on the command line as follows:
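A minimal stand-in for such a command-line check (not the author's original command; the port, table names, and payload contents are all assumptions for illustration):

```js
// batch-test.js -- hypothetical smoke test for batched queries.
// Sends a JSON array of two operations to /v1/graphql and prints the raw
// response, which should be a JSON array with one result per operation.
const http = require('http');

const batch = JSON.stringify([
  { query: 'query { author { id name } }' },   // illustrative table
  { query: 'query { article { id title } }' }, // illustrative table
]);

const req = http.request(
  {
    host: 'localhost',
    port: 8080, // assumed local graphql-engine port
    path: '/v1/graphql',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
  },
  (res) => {
    let body = '';
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => console.log(body));
  }
);

req.write(batch);
req.end();
```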
I haven't tested this with Apollo itself yet.

Edit: I was able to test batching from Apollo with this small Node script:

Limitations, known bugs & workarounds