Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Receive: add per request limits for remote write #5527
Receive: add per request limits for remote write #5527
Changes from 21 commits
0dbf4ec
9ac2340
7995468
87cef59
6cd031c
83ab7ca
a6addd3
ab7fd37
e836893
3240f69
1f38552
77a404b
d00ea15
2400aac
510248a
b8943a0
7f5c41b
7071c22
2cd1014
8561400
7efee5e
42d9f49
a98bcc3
cc83217
a9b5529
58d362b
003d9aa
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of extending the existing handler, is it possible to make the limiter a middleware that wraps the handler? This could be cleaner to maintain long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I don't think that's a good idea, because then the limiter middleware will have to do work that is done again in the main handler:
I wanted to avoid doing extra work at all costs to keep the "hot path" of Receive fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that's a good point. Well initially I thought the middleware would be on this handler which already has the decoded protobuf request, but here we don't have the raw body correct?
Also if I may ask, what is the practical use of setting limits on the request body? As an admin, I am not sure I would know what to set this limit to. I can understand the impact of limiting series or samples, but I would find it hard to know how to configure the body size limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. When it gets to that handler (it does only forward request across the hashring) almost all the "heavy" work is done and some of it will be wasted. Examples:
The request size limit is more geared towards networking concerns, I would say. It prevents clients from writing very big requests (even when the other limits aren't configured or hit, i.e. very long label name/values, or when clients are writing "junk"). This can still cause OOMs or possibly create a situation similar to a slow client attack.
This makes me wonder why Receive's remote write endpoint doesn't have a configurable server-side timeout... possibly could add this in a follow up too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an e2e test could help in understanding some real-world client behavior eg Prometheus, avalanche. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saswatamcode actually, what do you mean with
understanding some real-world client behavior eg Prometheus, avalanche.
?Do you mean, for example, testing that a well configured Prometheus or Avalanche will retry on a 429? I don't think it's worth to add an e2e test only for this reason. It means we're testing Prometheus/Avalanche and not Thanos.
I started working on the e2e tests and the further I progress on them the more I think they have no real value beyond what the handler test provides. The only additional value the e2e adds that I can see is to ensure that the CLI args are properly passed down to the request limiting code. Besides this, the request limiting code can be tested without having to add an e2e test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant if seeing how a hashring would behave, with such request limits and multiple tenants, would be useful?
But yes let's see what maintainers have to say! Unit tests might be sufficient here. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less + simpler tests while having the same verifications level = better, so I think a unit tests will suffice here.
In some way yes, but also... are we sure those projects have those tests? 🙃
Anyway, let's investigate/create focus tests in separate stream of work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this mutation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to kind of have 1 as default value for the test scenarios that do not explicitly set
amountSamples
.