-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tsdb/inverted index #6233
Tsdb/inverted index #6233
Conversation
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
- distributor -0.3%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, have a view questions though.
// we can _exactly_ match ob1 => (ob10, ob11) and know all fingerprints in those | ||
// resulting shards have the requested ob1 prefix (don't need to filter). | ||
var filter bool | ||
if shard.Of > len(ii.shards) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between len(ii.shards)
and ii.totalShards
? Since it seems that the length of ii.shards
does not change, we could use ii.totalShards
instead.
if shard.Of > len(ii.shards) { | |
if shard.Of > ii.totalShards { |
// the requested shard's min/max fingerprint values | ||
// in order to calculate the indices for the inverted index's | ||
// shard factor. | ||
requiredBits := index.NewShard(0, uint32(len(ii.shards))).RequiredBits() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for index.NewShard()
, why not index.NewShard(0, ii.totalShards)
?
if len(matchers) == 0 { | ||
for i := range shards { | ||
fps := shards[i].allFPs() | ||
result = append(result, fps...) | ||
} | ||
} else { | ||
for i := range shards { | ||
fps := shards[i].lookup(matchers) | ||
result = append(result, fps...) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this extra logic for no matchers be abstracted into the lookup()
function of indexShard
?
I saw that we do the same logic in the Lookup()
function of InvertedIndex
as well.
} | ||
|
||
if 1<<(shard.TSDB().RequiredBits()) != shard.Of { | ||
return fmt.Errorf("Shard factor must be a power of two, got %d", shard.Of) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error strings should not be capitalized
return fmt.Errorf("Shard factor must be a power of two, got %d", shard.Of) | |
return fmt.Errorf("shard factor must be a power of two, got %d", shard.Of) |
require.Nil(t, err) | ||
} | ||
|
||
func Test_BitPrefixDeleteAddLoopkup(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo
func Test_BitPrefixDeleteAddLoopkup(t *testing.T) { | |
func Test_BitPrefixDeleteAddLookup(t *testing.T) { |
// for _, shard := range []uint32{2, 4, 8, 16, 32, 64, 128} { | ||
for _, shard := range []uint32{2} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why you only tested for 2 shards?
return nil, err | ||
} | ||
|
||
var extractor func(unlockIndex) []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt about extracting a type for this function?
@owen-d I was late to the game and submitted my review only after the PR was already merged. Any chance you could answer the questions if it's not too much effort? 😇 |
In order to maintain query consistency, we must ensure the same shard (i.e.
1_of_16
) corresponds to the same series. Since TSDB calculates shard inclusion differently (by using bit prefixes instead of modulos), we need to account for this change when resolving streams in the ingester's inverted index.This is the first PR which doesn't use the new code paths.
Also, because the bit prefixed inverted index will support dynamic sharding of higher or lower shard factors, it occasionally needs to do additional computation when translating higher granularity shard factors to lower granularity ones (i.e. 16 -> 4).
ref #5428