Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retention speedup #4172

Merged
merged 3 commits into from
Aug 19, 2021
Merged

Conversation

sandeepsukhani
Copy link
Contributor

What this PR does / why we need it:
It contains the following changes to speed up the mark phase of retention:

  1. We delete a chunk only when the whole chunk is out of retention. If a chunk spans multiple tables, we will retain its entry in all those tables until the whole chunk expires, even if the table is not expected to have it based on retention config. This PR changes the approach to drop the chunk index entry early by checking the interval's end time for which a table is expected to have chunks indexed.

  2. We clean up the series by iterating over the index entries to find which ones should be deleted. Because of the index format, we need to go through all the label entries(including labels for unrelated series) for the user. This PR changes the code to build the index keys using GetCacheKeysAndLabelWriteEntries, which can be used to delete the index entries expected to be deleted efficiently.

  3. When a chunk is deleted partially, we mark it for deletion. If a chunk spans multiple tables, we will mark it for deletion when the first table that indexes it is processed. If, for some reason, the compactor goes down or delays applying retention to the next tables, the compactor would fail to find the source chunk(which is needed for re-building a new chunk), which would fail the retention. This PR changes the code to mark source chunk for deletion when the table being processed is the last change that indexes it.

Checklist

  • Tests updated

@sandeepsukhani sandeepsukhani requested a review from a team as a code owner August 16, 2021 12:26
@jeschkies jeschkies requested review from owen-d and jeschkies August 17, 2021 12:20
Copy link
Contributor

@jeschkies jeschkies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code wise this looks good to me. Unfortunately, I don't know enough to judge the logic.

@sandeepsukhani
Copy link
Contributor Author

Code wise this looks good to me. Unfortunately, I don't know enough to judge the logic.

Thanks @jeschkies for the review! I am going to merge this and let this go through our internal release cycle. I am also testing this in parallel with a dev cluster.

@sandeepsukhani sandeepsukhani merged commit 29b07d1 into grafana:main Aug 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants