Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store: Add min resolution to be specific what downsampling levels to use on query #1104

Closed

Conversation

mjd95
Copy link
Contributor

@mjd95 mjd95 commented May 2, 2019

Changes

Refactor the series interface to take a list of allowed resolutions instead of a max resolution. This is to allowed viewing e.g. only 5m downsampled data, which is a useful debugging tool. (c.f. #922)

WIP as I'm currently having issues generating the protos via the Makefile (probably something about my setup). Also wondering whether we should have stronger typing on the new fields (e.g., a repeated enum rather than a repeated int64).

Verification

Only updated the unit tests to pass so far, happy to add more.

@mjd95
Copy link
Contributor Author

mjd95 commented May 2, 2019

Let me know what you think of this approach @bwplotka

Copy link
Member

@povilasv povilasv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I like it 🥇

Could you add changelog entry and tests.
I think we are not covering the case where we explictly request some resolution.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Can you share the Make proto issue you have?

So max resolution was simple. We could put 2h and essentially we take every downsamplng level that is not higher than 2h (so 1h in our case). If we don't have 1h we take 5m etc.

Allowed resolution method is tricky as it assumes the resolution apriori. Someday maybe we will give more flexibility on downsampling? Then the migration would be painful etc

I have just yolo idea. What if we would add just min_resolution to the proto? This gives us full flexibility on debugging but also other use cases for choosing exactly one resolution. Thoughts? We can make this enum on Querier side (just one side) to chose from given list, but the implementation would work on min and max resolution.
It also gives a way for other use cases like use any downsampling but not raw data, because it's expensive. What do you think?

Just curious have you found any bug in test cases or getFor? Something is not right there overall, wonder if you found anything.

Curious what others think of it: @povilasv @GiedriusS @domgreen @brancz ? It is essentially super useful for debugging and more explicit than current max resolution. With ONLY max resolution it's not immidiately clear that it fallbacks to lower resolution if not having the requested ones. If we have both max and min I think it would be more clearly and powerful.

@mjd95
Copy link
Contributor Author

mjd95 commented May 9, 2019

Thanks for the comments @povilasv @bwplotka

Min source resolution sounds like a good idea. It covers all the use cases (just debugging basically) I had for allowedResolution and it a bit more flexible as well.

Will have a look at this again tomorrow, will check if I can repro the proto issue.

@bwplotka bwplotka changed the title Store: Refactor series interface (WIP) Store: Add min resolution to be specific what downsampling levels to use on query May 9, 2019
@GiedriusS
Copy link
Member

I completely agree with @bwplotka but I just had one idea while working through getFor's code: we should probably prioritize the higher resolution data in general. For example, the user specifies that they want max 1h resolution and raw resolution as the minimum. In such a case, we should probably opt to first fill the requested time range with the raw resolution where possible and only then fill it up with the downsampled one. It would probably make more sense from the user's perspective if they are using a small range vector in their query. Such a change would give more results to the users compared to selecting the data with the worst resolution first.

Thoughts?

@mjd95 mjd95 force-pushed the refactor-store-series-interface branch from ff44fd2 to 4b4e3ef Compare May 22, 2019 18:56
@mjd95
Copy link
Contributor Author

mjd95 commented May 22, 2019

Changed to use the min resolution suggestion, which is much nicer :) I also rebased on top of #1132 for building the protos, but the main changes for this PR are still the ones to the store API.

@GiedriusS that's an interesting question whether we should prefer raw data or downsampled data. Preferring the downsampled like we do now does mean we're sending less data overall which is good, but on the other hand having the raw data is more useful for the user. I think what we're doing now is reasonable but would have to think about it more, there's definitely a trade-off here.

Edit - just saw #1170 👍

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIce, but we need a rebase as we fixed this part. I think it should not change much of you algorithm.

Thanks for this and sorry for review delay.

}

// The covering is greedy, preferring blocks of the lowest allowed resolutions.
func (s *bucketBlockSet) getFor(minTime, maxTime, minResolution, maxResolution int64) (bs []*bucketBlock) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We change a big for this part, so wonder how it will look like after rebase, but generally it looks like something we want 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 My changes are very small now actually. Previously I had refactored this function to not be recursive, but I don't think that it helpful any more (the recursion does a nice job of filling in the gaps).

pkg/store/bucket.go Outdated Show resolved Hide resolved
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know when we should take another look (:

@brancz brancz requested a review from bwplotka June 4, 2019 10:17
@mjd95 mjd95 force-pushed the refactor-store-series-interface branch from b71c5d2 to 84ba758 Compare June 11, 2019 12:48
@mjd95
Copy link
Contributor Author

mjd95 commented Jun 11, 2019

Sorry for the delay, PTAL @bwplotka

@mjd95
Copy link
Contributor Author

mjd95 commented Jun 12, 2019

Closing as the only current use case for adding this was debugging #922, but there is a plan to add a UI for thanos bucket web which seems like it will cover this debugging case and more.

Discussed whether there might be any other use cases. Potentially, but we don't have any FRs yet it's easy enough to add if we get one :)

@mjd95 mjd95 closed this Jun 12, 2019
@mjd95 mjd95 deleted the refactor-store-series-interface branch June 12, 2019 09:03
@bwplotka
Copy link
Member

Thanks for this @mjd95

We agreed offline that the debug use case that this is solving is solved in better way by this: #1246 (cc @jaseemabid), so postponing this until more use cases appears. (:

@mjd95 mjd95 restored the refactor-store-series-interface branch June 12, 2019 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants