Moving GC for badger #6854
dunno what's up with the gen-check failure, there is no information there.
When you expand the second-to-last thing on circle it shows you the diff. The usual solution is to just run
Effectiveness of moving GC: Before moving GC, after about a week with online GC in every compaction:
An hour after moving GC:
rebased on master.
we can't really continue and leave a ticking bomb for the next restart; the user might not see it.
it was there to support a potential CopyTo interface; but we'll cross that bridge when we get there.
Changes per review comments (quite a few!):
ok, this is ready for the next round of review; in the meantime I'll deploy it in my discard node to get some timings and real-world burn.
Timings are looking really good:
previously it was taking about 40min; that's 4.5 times faster.
We really want this in; however, it's not a hard blocker for v1.11.1.
Mostly naming comments but also some questions around file names.
TL;DR: When I have 2 databases and 4 different paths (old, new, link target, backup), the names really matter.
    panic(fmt.Errorf("error renaming old badger db dir from %s to %s: %w; USER ACTION REQUIRED", dbpath, oldpath, err)) //nolint
}

if err = os.Symlink(path, dbpath); err != nil {
What if I wasn't using symlinks?
is that a problem?
The issue with not using symlinks is moving across filesystems, which can take a long time.
Basically, what if I had the database at "foo". Now I'll end up with a symlink at "foo" pointing to a random directory.
If I started out with plain directories, I'd rather end up with plain directories. It's not critical, but the current version is going to confuse users.
- They initialize their repo.
- They see where their hotstore is.
- The hotstore gets moved and replaced by a symlink.
Hrm, it complicates things to try and keep it symlink-free, as we'll need two versions of things.
We should probably be clear to users that the splitstore owns the splitstore directory; they shouldn't muck with it.
The coldstore is another matter though.
Fair enough.
dbpath := b.opts.Dir
oldpath := fmt.Sprintf("%s.old.%d", dbpath, time.Now().Unix())

if err = os.Rename(dbpath, oldpath); err != nil {
Won't this just rename the symlink? Don't we need to move the actual file?
We need integration tests for this.
Ah, I see... we are renaming the symlink intentionally. Why not just delete it, then delete linkPath?
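The point raised in this thread (that `os.Rename` applied to a symlink moves the link itself and leaves the directory it points to in place) can be checked with a small standalone sketch. The directory names below are hypothetical and are not the paths used by the PR; the sketch also assumes a platform where creating symlinks is permitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renameLinkDemo creates a real directory and a symlink pointing at it,
// renames the symlink, and reports (a) whether the renamed entry is still a
// symlink and (b) whether the original target directory is still in place.
func renameLinkDemo() (bool, bool, error) {
	root, err := os.MkdirTemp("", "symlink-rename")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	target := filepath.Join(root, "badger.data") // hypothetical real DB directory
	dbpath := filepath.Join(root, "badger")      // hypothetical configured path (a symlink)
	oldpath := dbpath + ".old"                   // hypothetical "old" name

	if err := os.Mkdir(target, 0o755); err != nil {
		return false, false, err
	}
	if err := os.Symlink(target, dbpath); err != nil {
		return false, false, err
	}

	// Rename acts on the link itself; the target directory is not moved.
	if err := os.Rename(dbpath, oldpath); err != nil {
		return false, false, err
	}

	// Lstat does not follow symlinks, so it describes the renamed entry itself.
	fi, err := os.Lstat(oldpath)
	if err != nil {
		return false, false, err
	}
	stillLink := fi.Mode()&os.ModeSymlink != 0

	ti, err := os.Lstat(target)
	if err != nil {
		return false, false, err
	}
	return stillLink, ti.IsDir(), nil
}

func main() {
	stillLink, targetIntact, err := renameLinkDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("renamed entry is still a symlink:", stillLink)
	fmt.Println("target directory untouched:", targetIntact)
}
```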
yeah, it's fine -- we'll delete deep in deleteDB which follows symlinks.
Also, I don't think we need an integration test for this, we can test pretty much everything with unit tests as the changes are pretty self-contained.
What do you want to add to the unit test?
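As a rough illustration of what "deleting through the symlink" could look like: the helper below is a hypothetical sketch, not the PR's `deleteDB`, whose actual implementation is not shown in this thread. It assumes the goal is to remove both the directory the link resolves to and the link itself.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeThroughLink is a hypothetical helper. If path is a symlink, it removes
// the directory the link resolves to and then the link. If path is a plain
// directory, it just removes the directory.
func removeThroughLink(path string) error {
	fi, err := os.Lstat(path)
	if err != nil {
		return err
	}
	if fi.Mode()&os.ModeSymlink == 0 {
		return os.RemoveAll(path)
	}
	target, err := filepath.EvalSymlinks(path)
	if err != nil {
		return err
	}
	if err := os.RemoveAll(target); err != nil {
		return err
	}
	// The link is now dangling; remove it as well.
	return os.Remove(path)
}

// demoRemove builds a scratch directory plus a symlink to it, deletes through
// the link, and reports whether the link and the target are both gone.
func demoRemove() (bool, bool, error) {
	root, err := os.MkdirTemp("", "delete-through-link")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	target := filepath.Join(root, "db.real") // hypothetical names
	link := filepath.Join(root, "db.old")
	if err := os.Mkdir(target, 0o755); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(target, "000001.vlog"), []byte("x"), 0o644); err != nil {
		return false, false, err
	}
	if err := os.Symlink(target, link); err != nil {
		return false, false, err
	}
	if err := removeThroughLink(link); err != nil {
		return false, false, err
	}
	_, linkErr := os.Lstat(link)
	_, targetErr := os.Lstat(target)
	return os.IsNotExist(linkErr), os.IsNotExist(targetErr), nil
}

func main() {
	linkGone, targetGone, err := demoRemove()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("link removed:", linkGone, "target removed:", targetGone)
}
```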
> yeah, it's fine -- we'll delete deep in deleteDB which follows symlinks.
It's fine, but a bit weird.
> What do you want to add to the unit test?
The unit test is fine. I thought there was a bug here which would have been caught by a basic test. I revisited this because I then did look at the test, and it clearly checked this case.
Your tests are good.
Other feedback: The renaming logic with the symlink dance is strange. We should be preserving symlinks if the user is using them, and not using symlinks if the user isn't already using them.
Wait, is there a realistic situation where lotus would run in a filesystem that doesn't support symlinks?
Gah, you are right about this. We are creating absolute links, I'll fix in follow up.
Well... probably not. Given that NFS does. Although someone probably does it anyways... Sorry, I've had some recent encounters with symlinks.
Follow-up in #6905
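For context on the absolute-link remark: one common way to avoid absolute symlinks is to compute the target relative to the directory that holds the link. The sketch below only illustrates that idea with `filepath.Rel`; it is not taken from #6905, and the helper name and paths are made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// relativeLinkTarget is a hypothetical helper. It returns targetPath expressed
// relative to the directory containing linkPath, which is the form a relative
// symlink at linkPath would need to store.
func relativeLinkTarget(linkPath, targetPath string) (string, error) {
	return filepath.Rel(filepath.Dir(linkPath), targetPath)
}

func main() {
	// Hypothetical layout: the link and the new DB directory are siblings.
	rel, err := relativeLinkTarget("/repo/splitstore/hot.badger", "/repo/splitstore/hot.badger.1234")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The result could then be passed to os.Symlink(rel, linkPath).
	fmt.Println(rel)
}
```

A relative link of this kind keeps working if the whole repo directory is moved, which an absolute link does not.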
Cherry-picked from #6728.
This adds the option to perform moving GC on a badger blockstore; this effectively reclaims all space.
It is also hooked into the splitstore, to perform moving GC once every 20 compactions (about once a week, user configurable).
Rationale: badger supports online GC, which is quite fast but ultimately ineffective at reclaiming space. It has been observed that the hotstore size slowly creeps up, despite frequent garbage collections.
This addresses the situation with real moving GC.
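The scheduling described above (online GC on ordinary compactions, moving GC once every 20) can be sketched roughly as follows. All type and function names here are hypothetical and are not the splitstore's actual API; treating a frequency of 0 as "moving GC disabled" is also an assumption made for the sketch.

```go
package main

import "fmt"

type gcKind int

const (
	onlineGC gcKind = iota // cheap GC run as part of a regular compaction
	movingGC               // full copy into a fresh DB, reclaiming all space
)

// gcScheduler is a hypothetical counter that decides which GC to run after
// each compaction. every is the user-configurable frequency (20 in the PR
// description); 0 is assumed here to mean moving GC never runs.
type gcScheduler struct {
	every uint64
	count uint64
}

// afterCompaction records one finished compaction and returns the GC kind
// that should follow it.
func (s *gcScheduler) afterCompaction() gcKind {
	s.count++
	if s.every > 0 && s.count%s.every == 0 {
		return movingGC
	}
	return onlineGC
}

// countMoving simulates the given number of compactions and returns how many
// of them would trigger moving GC.
func countMoving(every, compactions uint64) int {
	s := &gcScheduler{every: every}
	n := 0
	for i := uint64(0); i < compactions; i++ {
		if s.afterCompaction() == movingGC {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("moving GCs in 40 compactions:", countMoving(20, 40))
}
```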