Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(rw2.0): reject remote write 2.0 based on content type #10423

Merged
merged 6 commits into from
Jan 15, 2025

Conversation

krajorama
Copy link
Contributor

@krajorama krajorama commented Jan 13, 2025

What this PR does

Reject remote write 2.0 request properly with HTTP status 415, based on RW2.0 spec.

The current solution returns 2xx , but doesn't actually ingest the samples.

Prometheus does detect this
prometheus-1 | time=2025-01-13T13:01:35.028Z level=ERROR source=queue_manager.go:1670 msg="non-recoverable error" component=remote remote_name=150c10
url=http://mimir-1:8001/api/v1/push failedSampleCount=2000
failedHistogramCount=0 failedExemplarCount=0
err="sent v2 request with 2000 samples, 0 histograms and 0 exemplars;
got 2xx, but PRW 2.0 response header statistics indicate 0 samples,
0 histograms and 0 exemplars were accepted; assumining failure e.g.
the target only supports PRW 1.0 prometheus.WriteRequest, but does not
check the Content-Type header correctly"

But we can do better and also start working towards RW2.0 support.

Which issue(s) this PR fixes or relates to

Related to #9072
Copied and dumbed down integration test from #10432

Checklist

  • [N/A] Tests updated. Tested manually, we are going to replace this with proper support.
  • [N/A] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [N/A] about-versioning.md updated with experimental features.

@krajorama
Copy link
Contributor Author

With this patch, Prometheus gets:

prometheus-1  | time=2025-01-13T13:06:03.430Z level=ERROR source=queue_manager.go:1670
 msg="non-recoverable error" component=remote remote_name=150c10 url=http://mimir-1:8001/api/v1/push failedSampleCount=2000 failedHistogramCount=0 failedExemplarCount=0
err="server returned HTTP status 415 Unsupported Media Type: remote-write v2 is not supported\n"

The current solution returns 2xx , but doesn't actually ingest the samples.

Prometheus does detect this
prometheus-1  | time=2025-01-13T13:01:35.028Z level=ERROR
source=queue_manager.go:1670 msg="non-recoverable error"
component=remote remote_name=150c10
 url=http://mimir-1:8001/api/v1/push failedSampleCount=2000
failedHistogramCount=0 failedExemplarCount=0
err="sent v2 request with 2000 samples, 0 histograms and 0 exemplars;
 got 2xx, but PRW 2.0 response header statistics indicate 0 samples,
 0 histograms and 0 exemplars were accepted; assumining failure e.g.
 the target only supports PRW 1.0 prometheus.WriteRequest, but does not
 check the Content-Type header correctly"

But we can do better and also start working towards RW2.0 support.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama force-pushed the feat/remote-write-two branch from 70ef976 to 5dd0ff1 Compare January 13, 2025 13:13
@krajorama krajorama marked this pull request as ready for review January 13, 2025 13:13
@krajorama krajorama requested a review from a team as a code owner January 13, 2025 13:13
Copy link
Member

@jesusvazquez jesusvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except there could maybe be a test?

From #10432

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo possible nits. Thanks for adding the test!

integration/distributor_test.go Outdated Show resolved Hide resolved
integration/distributor_test.go Outdated Show resolved Hide resolved
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama enabled auto-merge (squash) January 15, 2025 15:54
@krajorama krajorama merged commit 1d7978e into main Jan 15, 2025
29 checks passed
@krajorama krajorama deleted the feat/remote-write-two branch January 15, 2025 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants