-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LabelNames API with matchers #9083
Conversation
|
||
func (r blockIndexReader) LabelNames(matchers ...*labels.Matcher) ([]string, error) { | ||
if len(matchers) == 0 { | ||
return r.b.LabelNames() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This calls Block.LabelNames
which calls the index.Reader.LabelNames
and it's the only call there and it has no matchers. You'll find a TODO later to implement matchers there, but actually we should just move the matchers evaluation from here to index.Reader.LabelNames
.
I would love to do that by moving labelNamesWithMatchers
to index.LabelNamesWithMatchers
, however, that's a bigger refactor since we'd need to move the PostingsForMatchers
and even the IndexReader
interface there too. I also think that it makes sense, but it's just out of the scope of this PR.
@@ -479,6 +487,12 @@ func (r blockIndexReader) LabelValueFor(id uint64, label string) (string, error) | |||
return r.ir.LabelValueFor(id, label) | |||
} | |||
|
|||
// LabelNamesFor returns all the label names for the series referred to by IDs. | |||
// The names returned are sorted. | |||
func (r blockIndexReader) LabelNamesFor(ids ...uint64) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needing to implement this here feels cumbersome to me, I feel that the interface we're implementing it shouldn't force us to do that, i.e. it should be split into two. But I don't have enough knowledge of the packet yet to have a better picture.
labelNames := h.head.postings.LabelNames() | ||
h.head.symMtx.RUnlock() | ||
if len(matchers) == 0 { | ||
h.head.symMtx.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this but I can't see why it's needed: h.head.postings
is a MemPostings
, and MemPostings.LabelNames
locks itself, same happens for LabelValues
above. the symMtx
seems to be a mutex to be used to acces h.head.symbols
but we're not accessing that in any way here so...
🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember there was some race somewhere, but cant remember. Might as well have it like before :)
if err != nil { | ||
return nil, storage.ErrNotFound | ||
} | ||
for _, off := range offsets { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking first to have an [][]uint32 allOffsets
with all offsets slices here and then make an k-way merge using a heap. However, that would mean that allOffsets could potentially be the entire copy of the all the series with their symbols offsets, which seemed unnecessary so I decided to go with a map and keep deduplicating them as we get them.
return nil, err | ||
} | ||
|
||
var postings []uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could potentially be the list of all series in the block, like if the only matcher we pass in matches all the series. But I think we'll save important roundtrips later in r.LabelNamesFor
by grabbing them all here.
@@ -1168,6 +1168,7 @@ func (m mockIndex) LabelValues(name string, matchers ...*labels.Matcher) ([]stri | |||
for _, series := range m.series { | |||
for _, matcher := range matchers { | |||
if matcher.Matches(series.l.Get(matcher.Name)) { | |||
// TODO(colega): shouldn't we check all the matchers before adding this to the values? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have N matchers here and we saw that series matches one of them, so we're putting it into values, that's wrong, right?
This is the mock index so 🤷 , but if you confirm, I'll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through all the files and the high level approach LGTM, it is similar to what we do for LabelValues. I will take a detailed look next week. Thanks for working on this!
NB This doesn't actually implement it in the index, just plumbs it through for now... Signed-off-by: Tom Wilkie <tom@grafana.com>
Signed-off-by: Tom Wilkie <tom@grafana.com>
Can't see why do we need to hold a mutex on symbols, and the purpose of the LabelNamesFor method. Maybe I'll need to re-add this later. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This method provides the label names that appear in the postings provided. We do that deeper than the label values because we know beforehand that most of the label names we'll be the same across different postings, and we don't want to go down an up looking up the same symbols for all different series. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
However, I still don't understand why do we need a mutex here. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
I still don't see why we need to grab that unrelated mutex, but at least now we're grabbing it consistently Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
I have force-pushed the branch rebased from the latest master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I only have comments about comments and error handling.
Additionally, I don't see any tests cases in api_test.go which is having >1 matcher groups. Since we modified that part, could you please add 1-2 test cases to cover that?
labelNames := h.head.postings.LabelNames() | ||
h.head.symMtx.RUnlock() | ||
if len(matchers) == 0 { | ||
h.head.symMtx.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember there was some race somewhere, but cant remember. Might as well have it like before :)
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Added the testcases: 17c4594 |
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Updated Prometheus to latest commit that includes changes in the LabelQuerier.LabelNames method signature: prometheus/prometheus#9083 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Upgrade prometheus to b1ed4a0a663d0c62526312311c7529471abbc565 Updated Prometheus to latest commit that includes changes in the LabelQuerier.LabelNames method signature: prometheus/prometheus#9083 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add matchers to all LabelNames implementations Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Pass matchers from distributor to ingester's block tsdb Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Ingester.LabelNames(w/matchers) for chunks too Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use the old distributor query path when cfg is disabled Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Sort label names in old implementations Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * BlockStoreQueryable querying both paths Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fix pkg/querier/querier_test.go Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use the provided ctx Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Make doc * Test distributorQueryable.LabelNames And fix missing matchers in the call to Distributor Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Test Distributor.LabelNames Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Test Ingester.LabelNames with matchers Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add matchers to querier test Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add CHANGELOG.md entry Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Move changes to mimir/unreleased changelog section Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add another label to testcase to improve test If multiple matchers didn't work, we wouldn't realize as the result set was the same as for a single matcher. Now the multiple matcher doesn't match the series with status=500 which has a reason=broken label Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Update docs, add `enabled` suffix to the config. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Remove fixme comment and feature flag from blocks_store_queryable All the StoreGateways (actually, one) are already supporting matchers in the call so we don't need to use a feature flag on this. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Update span name Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Marco Pracucci <marco@pracucci.com> * Remove feature flag from blocks_store_queryable Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fixup changelog entry Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Remove fingerprint collision testcase At least until we find a collision between two different label names. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Handle matchers in tenantfederation/mergeQuerier.LabelNames Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Opinionated refactor of filterValuesByMatchers Since there are three options, IMO, a switch is clearer than if/continue/if/elseif structure. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Marco Pracucci <marco@pracucci.com>
Since prometheus/prometheus#9083 prometheus now provides matchers to the LabelNames method on the LabelQuerier interface, so in order to upgrade Prometheus we need to support that. This partially solves cortexproject#3658 as now the block store queryable uses the LabelNames method with matchers. However, we're still using the ingesters' MetricsForLabelMatchers method to perform the LabelNames w/matchers call on the distributor. That change will be tackled separately as it breaks ingester's contracts and needs to be released with a feature flag to perform a backwards compatible release. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Upgrade Prometheus to LabelNames with matchers Since prometheus/prometheus#9083 prometheus now provides matchers to the LabelNames method on the LabelQuerier interface, so in order to upgrade Prometheus we need to support that. This partially solves #3658 as now the block store queryable uses the LabelNames method with matchers. However, we're still using the ingesters' MetricsForLabelMatchers method to perform the LabelNames w/matchers call on the distributor. That change will be tackled separately as it breaks ingester's contracts and needs to be released with a feature flag to perform a backwards compatible release. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Updated CHANGELOG.md Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This bumps prometheus up to the changes from prometheus/prometheus#9083 and Cortex to cortexproject/cortex#4380 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
This bumps prometheus up to the changes from prometheus/prometheus#9083 and Cortex to cortexproject/cortex#4380 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
* Upgrade Prometheus to LabelNames with matchers Since prometheus/prometheus#9083 prometheus now provides matchers to the LabelNames method on the LabelQuerier interface, so in order to upgrade Prometheus we need to support that. This partially solves cortexproject#3658 as now the block store queryable uses the LabelNames method with matchers. However, we're still using the ingesters' MetricsForLabelMatchers method to perform the LabelNames w/matchers call on the distributor. That change will be tackled separately as it breaks ingester's contracts and needs to be released with a feature flag to perform a backwards compatible release. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Updated CHANGELOG.md Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Signed-off-by: Alvin Lin <alvinlin@amazon.com>
This modifies the
LabelQuerier.LabelNames()
method adding the matchers to it, similar to it'sLabelValues
counterpart.Previously, the matchers for
LabelNames
were supported only on the HTTP API surface, then the HTTP API implementation made use of theSelect()
call with special hints to tell TSDB to skip the samples.This however is hacky and the implementation divergence causes issues in other projects depending on this, like cortex: cortexproject/cortex#3658
This new approach does what one would expect to: call the
LabelMatchers
providing the matchers, and let the TSDB internals decide how to get that data. As a side effect, we're saving lots of resources, especially on deployments where there are tons of series with different label values but same label names (I'd say this is a common scenario in a cloud-native world, isn't it?)The work was initiated by @tomwilkie and I finished some details.
See more comments on the PR for thoughts.