Add querier metric for block source and compaction level #7112
Force-pushed from e7fe57a to 75042c9.
This is great. I never knew this is "knowable". It could serve as a reliable indicator for the compactor falling behind.
Force-pushed from 87de38c to 08630e1.
Updated the PR to move this metric to the store-gateway itself, and use a summary to be consistent with cortex_bucket_store_series_blocks_queried. There don't seem to be any related tests to update. I'm on the fence about whether this metric should be about compacted or non-compacted blocks - input appreciated. My immediate use case is mostly interested in non-compacted blocks, but @dimitarvdimitrov pointed out that understanding compacted blocks could be useful as well. Either one can be derived from the other, along with |
Thanks Jonathan for re-iterating on it. I left a couple of final comments. About tests to improve, I think you can look at the test using assertQueryStatsMetricsRecorded().
The potential value that I see is being able to tell when store-gateways start querying non-compacted 2-hour blocks. If the compactor falls behind on compacting 12-hour blocks it's not that big of a deal, so I don't think observability there is as critical. So if the new metric you are adding only counts non-compacted blocks, then I think it will still serve this purpose. Covering all blocks via the (level, source) tuple Marco suggested might future-proof us, so if it's not too hard, maybe we should do that?
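To make the (level, source) idea concrete, here is a minimal sketch of grouping queried blocks by that tuple and deriving the non-compacted count from the grouped totals. It is not the PR's implementation: the `blockMeta` and `blockKey` types, the helper names, and the example source values are hypothetical, and it assumes that compaction level 1 marks a block that has not been compacted yet.

```go
package main

import (
	"fmt"
	"sort"
)

// blockMeta is a hypothetical, trimmed-down view of a block's metadata.
type blockMeta struct {
	Source string // illustrative values only, e.g. "ingester" or "compactor"
	Level  int    // compaction level; 1 is assumed to mean "not yet compacted"
}

// blockKey is the (source, level) tuple used to group queried blocks.
type blockKey struct {
	Source string
	Level  int
}

// countBySourceAndLevel tallies queried blocks per (source, level) tuple.
func countBySourceAndLevel(blocks []blockMeta) map[blockKey]int {
	counts := make(map[blockKey]int)
	for _, b := range blocks {
		counts[blockKey{Source: b.Source, Level: b.Level}]++
	}
	return counts
}

// nonCompacted derives the non-compacted total from the grouped counts,
// under the assumption that level 1 means "never compacted".
func nonCompacted(counts map[blockKey]int) int {
	total := 0
	for k, c := range counts {
		if k.Level == 1 {
			total += c
		}
	}
	return total
}

func main() {
	queried := []blockMeta{
		{Source: "ingester", Level: 1},
		{Source: "ingester", Level: 1},
		{Source: "compactor", Level: 2},
		{Source: "compactor", Level: 3},
	}
	counts := countBySourceAndLevel(queried)

	// Sort keys so the printed order is stable.
	keys := make([]blockKey, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if keys[i].Source != keys[j].Source {
			return keys[i].Source < keys[j].Source
		}
		return keys[i].Level < keys[j].Level
	})
	for _, k := range keys {
		fmt.Printf("source=%s level=%d blocks=%d\n", k.Source, k.Level, counts[k])
	}
	fmt.Println("non-compacted blocks:", nonCompacted(counts))
}
```

Grouping by the full tuple keeps both views available: the non-compacted count is a filter over the grouped data, whereas a metric that only counts non-compacted blocks cannot be broken down by level afterwards.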
Adds a querier metric, cortex_querier_compacted_blocks_queried_total, which indicates the number of blocks fetched from store gateways that were compacted. This can be compared to cortex_querier_blocks_queried_total.
- move metric to storegateway
- rename metric to cortex_bucket_store_series_non_compacted_blocks_queried
This reverts commit 50d04932f40ba470e2f6e72a3e5ed9a6b82f8acd.
Force-pushed from 08630e1 to 1b5a1fc.
Force-pushed from 1b5a1fc to 5da75da.
Moving to draft since testing in dev shows that some block meta isn't populated before creating the metrics. Feel free to comment/review otherwise.
@dimitarvdimitrov I noticed that some block meta I was using wasn't being populated since it was being fetched via the BucketIndexMetadataFetcher rather than the MetadataFetcher, which I see was changed in #6808. Since compaction level and source aren't available in the bucket index, I reverted to using the MetadataFetcher (integration tests still use this). Let me know what you think.
Force-pushed from 56026bf to e0c5cd2.
I'm afraid this will make the store-gateway scan the bucket every 15 minutes instead of relying on the bucket index. This makes scanning more brittle, slower, and adds costs for list operations (we have docs on the bucket index which contain some more details). Because of this I think it's best to keep using the bucket index scanner. One option to resolve this is to update the bucket index to start including these two items.
I agree with what @dimitarvdimitrov said. It's not an option. Actually we want to get rid of |
This reverts commit fdd002a6f49f45f43a1919d77822aa17eccfb575.
Force-pushed from 755c4c9 to 6b52c5b.
@dimitarvdimitrov I took your suggestion, thanks. The bucket index JSON field name is |
LGTM, thank you for your patience with this PR :)
Once this PR is merged, should I open a follow-up PR to do this or have you already started on it?
I have not started that yet. If you'd like to do a follow-up that would be great.
What this PR does
Adds source and compaction level to the cortex_bucket_store_series_blocks_queried metric, which indicates the number of compacted blocks that were queried from store gateways.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.