Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not query blocks store for time ranges already covered by ingesters #2642

Conversation

pracucci
Copy link
Contributor

@pracucci pracucci commented May 28, 2020

What this PR does:
When -querier.query-store-after is configured and running with the blocks storage, we should be able to safely manipulate the query maxt for series fetched from the store because more recent samples are 100% fetched from ingesters only.

This is a quite important optimisation for the blocks storage where we can afford to query ingesters only for the last few hours and thus we could skip querying the most recent non-compacted-yet blocks from the storage.

We tested this change in a large cluster (60 ingesters) running some slow queries keeping any caching disabled (for testing), and we got the following results:

Note to reviewers:

  • I also took the opportunity to improve a bit the related tracing spans

Which issue(s) this PR fixes:
N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci pracucci requested review from pstibrany and gouthamve May 28, 2020 07:11
@pracucci pracucci force-pushed the do-not-query-store-for-time-ranges-already-covered-by-ingesters branch from 15fc37c to 4c47d7f Compare May 28, 2020 07:19
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@pracucci pracucci marked this pull request as draft May 28, 2020 16:28
@pracucci pracucci force-pushed the do-not-query-store-for-time-ranges-already-covered-by-ingesters branch from aa93f40 to 265d2e0 Compare May 29, 2020 06:33
Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci pracucci force-pushed the do-not-query-store-for-time-ranges-already-covered-by-ingesters branch from 265d2e0 to c733124 Compare May 29, 2020 07:34
@pull-request-size pull-request-size bot added size/L and removed size/M labels May 29, 2020
@pracucci pracucci requested a review from pstibrany May 29, 2020 07:34
@pracucci pracucci marked this pull request as ready for review May 29, 2020 07:36
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Based on discussion with Marco, chunk storage wouldn't benefit much from this optimization, because reasonable query_store_after parameter needs to be very small (chunks can be flushed to storage soon after series gets stale).

Thanks also for improving tracing!

@pracucci pracucci changed the title Do not query store for time ranges already covered by ingesters Do not query blocks store for time ranges already covered by ingesters May 29, 2020
@pracucci pracucci merged commit 7df217e into cortexproject:master May 29, 2020
@pracucci pracucci deleted the do-not-query-store-for-time-ranges-already-covered-by-ingesters branch May 29, 2020 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants