Unwarranted large QPcache sizes because of not hashing the operation name for queryPlanStoreKey #2309
Comments
Created a PR:
Thanks for raising this issue and submitting a pull request! Our team will review and provide feedback shortly.
Hey @AneethAnand, can you help me understand the problem a bit better or provide a reproduction of the issue you're seeing? Based on your explanation I feel like I'm missing something. Here's my understanding: TL;DR, the only effect I see in this PR is changing the cache key from one unique ID to a slightly different one. For simplicity I'll assume your documents contain only one operation. This is how I imagine the scenario you've described.
For a long-running gateway, this results in 1 cache miss and 2 cache entries for the same query plan. But the next time (2) happens you shouldn't see another new cache entry or a cache miss. And once your gateway is restarted or your cache starts evicting old plans I wouldn't expect duplicate cache entries. All of what I've said above would be expected whether or not the key included the
I don't understand how this could be a side effect of the cache key we use. Are there possibly some other side effects of including
Hi @trevor-scheer, thank you for the detailed explanation and the scenario you've described. Background: most of our input queries contain only one operation, but in a few test scenarios we add multiple operations. I understand that adding the operation name to the key mostly helps with documents that contain multiple operations. Once we added the operation name to every request along with the query, we saw a sharp and consistent increase in the time taken by gateway.plan. That could happen only if there were cache misses all the time, so I was also thinking of increasing the cache size through config. But the real issue is that adding the operation name increases the cache size, because the operation name is not SHA-hashed in the cache key. The moment we reverted the PR that adds the operation name to the request, the numbers went back to normal. We will also keep track of cache hits and misses, which is why I created the other PR to add cache config to the QueryPlannerConfig. Overall, by hashing both the query and operation name to generate the cache key,
The proposed change to hash both the query and operation name is intended to address these issues. Thank you again for your insights and guidance in resolving this issue!
Ok, so what you're suggesting is this:
Thanks for the explanation, I've got a much better understanding of the problem now. I'll have a closer look at your PR. Just out of curiosity, how long are your operation names? Seems like typical names wouldn't be such a major contributor to the size of a cache entry unless they're massive. The size of the key would definitely pale in comparison to the size of a query plan (generally speaking).
The average length of an operation name is 30-40 characters, and there are 2500 unique operations running in the instances. Including the operation name in the request would therefore increase the cache size by roughly 87,500 bytes (2500 * 35, at one byte per character).
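The estimate above can be reproduced with a short sketch. It assumes the names are ASCII (one byte per character in UTF-8) and uses 35 as the midpoint of the quoted 30-40 range; the helper name and the sample operation name are hypothetical.

```typescript
// Hypothetical helper: total bytes that raw operation names add to cache keys.
// Assumes each key stores the name as UTF-8 text, unhashed.
function operationNameKeyBytes(names: string[]): number {
  return names.reduce((total, name) => total + Buffer.byteLength(name, "utf8"), 0);
}

// Figures quoted in the comment above.
const uniqueOperations = 2500;
const averageNameLength = 35; // midpoint of the 30-40 character range

// Back-of-the-envelope estimate: 2500 * 35 bytes.
const estimatedBytes = uniqueOperations * averageNameLength;

// The same figure measured from (made-up) 35-character ASCII names.
const sampleNames = Array.from({ length: uniqueOperations }, () => "x".repeat(averageNameLength));
const measuredBytes = operationNameKeyBytes(sampleNames);

console.log(estimatedBytes, measuredBytes);
```

Names containing non-ASCII characters would take more than one byte each, so the real overhead depends on the names actually in use.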
Description
We are using the Federation Gateway in production and recently added the operation name to the request along with the query. After the change, we noticed an increase in the time taken to generate the query plan, along with an increase in cache misses. Our analysis showed that the cause is that the operation name is not hashed when building the query plan store key.
https://github.com/apollographql/federation/blob/main/gateway-js/src/index.ts#L758
Impact
The increased cache size needed to store the query plans, together with the cache misses, is impacting the performance of our application.
Possible Solution
We could hash the operation name along with the query hash to properly incorporate the operation name into the key.
Hashing both the query and the operation name keeps the key at a fixed length, which saves cache space.
For example:
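A minimal sketch of the idea, assuming Node's built-in `crypto` module and SHA-256. The function name `hashedQueryPlanStoreKey` and its parameters are hypothetical and are not taken from the gateway source; the gateway's actual key construction may differ.

```typescript
import { createHash } from "crypto";

// Hypothetical helper, not the gateway's implementation.
// Combines an existing query hash with the optional operation name
// and returns a fixed-length hex digest to use as the cache key.
function hashedQueryPlanStoreKey(queryHash: string, operationName?: string): string {
  return createHash("sha256")
    .update(queryHash)
    // An absent operation name contributes an empty string.
    .update(operationName ?? "")
    .digest("hex");
}

// Made-up inputs for illustration only.
const exampleQueryHash = createHash("sha256").update("query GetUser { me { id } }").digest("hex");
const keyWithName = hashedQueryPlanStoreKey(exampleQueryHash, "GetUser");
const keyWithoutName = hashedQueryPlanStoreKey(exampleQueryHash);

// Both keys are 64 hex characters, regardless of the operation name's length.
console.log(keyWithName.length, keyWithoutName.length);
```

With this shape the key length no longer grows with the operation name, while different operation names for the same document still map to different keys.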
This would generate a unique hash for the query and operation name, which could be used as the query plan store key.