Splitstore Garbage Collection #6728

Closed
wants to merge 33 commits into from

vyzo
Contributor

@vyzo vyzo commented Jul 11, 2021

Implements online coldstore chain pruning and garbage collection with the lotus chain prune command.

See #6577 for discussion.

From the README:
The coldstore can be pruned and garbage collected using the lotus chain prune command.
Note that the command only initiates pruning; the operation itself runs asynchronously, as it
can take a long time to complete.

By default, pruning keeps all chain-reachable objects; however, the user has the option to specify
a retention policy for old state roots and message receipts with the --retention option.
The value is an integer with the following semantics:

  • If it is -1 then all state objects reachable from the chain will be retained in the coldstore.
    This is the (safe) default.
  • If it is 0 then no state objects that are unreachable within the compaction boundary will
    be retained in the coldstore.
    This effectively throws away all old state roots and it is maximally effective at reclaiming space.
  • If it is a positive integer, then it's the number of finalities past the compaction boundary
    for which chain-reachable state objects are retained.
    This allows you to keep some older state roots in case you need to reset your head outside
    the compaction boundary or perform historical queries.
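The three cases above can be sketched in Go as follows. This is only an illustration of the documented semantics, not the splitstore implementation: the names RetentionPolicy, ParseRetention and RetainsState are hypothetical, a state object is modelled simply by its distance in epochs past the compaction boundary, and the finality length of 900 epochs used in main is an assumption passed in as a parameter.

```go
package main

import "fmt"

// RetentionKind is a hypothetical classification of the --retention value.
type RetentionKind int

const (
	RetainAll        RetentionKind = iota // --retention=-1 (the safe default)
	RetainNone                            // --retention=0
	RetainFinalities                      // --retention=N, with N > 0
)

// RetentionPolicy is a hypothetical parsed form of the --retention option.
type RetentionPolicy struct {
	Kind       RetentionKind
	Finalities int64 // only meaningful for RetainFinalities
}

// ParseRetention maps the integer option value onto one of the three cases.
func ParseRetention(v int64) (RetentionPolicy, error) {
	switch {
	case v == -1:
		return RetentionPolicy{Kind: RetainAll}, nil
	case v == 0:
		return RetentionPolicy{Kind: RetainNone}, nil
	case v > 0:
		return RetentionPolicy{Kind: RetainFinalities, Finalities: v}, nil
	default:
		return RetentionPolicy{}, fmt.Errorf("invalid retention value %d", v)
	}
}

// RetainsState reports whether a chain-reachable state object that lies
// epochsPastBoundary epochs beyond the compaction boundary is kept.
// Objects at or inside the boundary (distance <= 0) are always kept.
func RetainsState(p RetentionPolicy, epochsPastBoundary, finalityEpochs int64) bool {
	if epochsPastBoundary <= 0 {
		return true
	}
	switch p.Kind {
	case RetainAll:
		return true
	case RetainNone:
		return false
	default:
		return epochsPastBoundary <= p.Finalities*finalityEpochs
	}
}

func main() {
	const finalityEpochs = 900 // assumed finality length, for illustration only
	for _, v := range []int64{-1, 0, 2} {
		p, err := ParseRetention(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("retention=%d keeps object 1000 epochs past boundary: %v\n",
			v, RetainsState(p, 1000, finalityEpochs))
	}
}
```
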

During pruning, unreachable objects are deleted from the coldstore. In order to reclaim space,
you also need to specify a garbage collection policy, with two possible options:

  • The --online-gc flag performs online garbage collection; this is fast but does not reclaim all
    space possible.
  • The --moving-gc flag performs moving garbage collection, where the coldstore is moved,
    copying only live objects. If your coldstore lives outside the .lotus directory, e.g. via
    a symlink to a different file system made up of cheaper disks, then you can specify the
    directory to move to with the --move-to option.
    This reclaims all possible space, but it is slow and also requires disk space to house the new
    coldstore together with the old coldstore during the move.
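The idea behind moving garbage collection can be sketched with an in-memory stand-in for the coldstore. This is a simplified illustration under stated assumptions, not the lotus implementation: the store type and the movingGC function are hypothetical, and a plain Go map takes the place of the on-disk blockstore. It shows why the old and new stores coexist until the move finishes, which is where the extra disk space requirement comes from.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for a coldstore,
// mapping an object key to its bytes.
type store map[string][]byte

// movingGC builds a fresh store containing only the live objects.
// The old store is left untouched, so both copies exist side by side
// until the caller drops the old one.
func movingGC(old store, live map[string]bool) store {
	fresh := make(store, len(live))
	for key, isLive := range live {
		if !isLive {
			continue
		}
		if data, ok := old[key]; ok {
			fresh[key] = data
		}
	}
	return fresh
}

func main() {
	old := store{
		"a": []byte("block a"),
		"b": []byte("block b"), // unreachable, to be dropped by the move
		"c": []byte("block c"),
	}
	live := map[string]bool{"a": true, "c": true}

	fresh := movingGC(old, live)
	fmt.Println("objects before move:", len(old))
	fmt.Println("objects after move: ", len(fresh))
}
```
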

@vyzo vyzo changed the base branch from master to feat/splitstore-redux July 11, 2021 14:53
@vyzo vyzo added the team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs label Jul 11, 2021
@vyzo vyzo force-pushed the feat/splitstore-gc branch from 4d184ec to f374662 Compare July 13, 2021 09:10
@vyzo
Contributor Author

vyzo commented Jul 13, 2021

rebased for changes in the base branch.

Base automatically changed from feat/splitstore-redux to master July 13, 2021 10:43
@vyzo vyzo force-pushed the feat/splitstore-gc branch from f374662 to 997e004 Compare July 13, 2021 14:56
@vyzo
Contributor Author

vyzo commented Jul 13, 2021

rebased on master.

@vyzo vyzo marked this pull request as ready for review July 14, 2021 12:54
@vyzo vyzo requested a review from Stebalien July 14, 2021 12:54
@vyzo
Contributor Author

vyzo commented Jul 14, 2021

I'll rebase and resolve the gen conflicts when we are ready to merge.

@Stebalien
Member

This needs integration tests. I.e., tests where we trigger GC while computing a tipset or at least something that approximates computing a tipset (i.e. a bunch of reads, then vm.Copy). Ideally:

  1. Start GC before computing, and during.
  2. End GC during and after.

VM execution doesn't just write blocks, it makes a bunch of assumptions about what we have and what we don't have.

@vyzo
Contributor Author

vyzo commented Jul 14, 2021

Agreed; for now I am testing in a live node, but we definitely want a proper integration test.
I propose we add it to the Harness test effort outlined in #6725

How does that sound to you?
It is something we can reasonably tackle without time pressure after the v1.11.1 release code freeze.

@Stebalien
Member

It's really a waste of time to review until there are tests. Testing on a node isn't likely to trigger the conditions we care about.

@Stebalien
Member

(also, let's please break this into two PRs: one that refactors (splits files), then one that makes the actual change)

@vyzo
Contributor Author

vyzo commented Jul 14, 2021

Agreed with the sentiment, following our sync discussion.
I'll cherry-pick the refactor out for easy review, and then the plan is to write a good enough integration test.
If we can get it done before the deadline, great.
If not, we don't have to ship it now, and people who are willing to take the risk can always use the PR directly.

@vyzo vyzo force-pushed the feat/splitstore-gc branch from a82d732 to 0ba34a9 Compare July 14, 2021 17:54
@vyzo vyzo changed the base branch from master to feat/splitstore-refactor-redux July 14, 2021 17:54
Base automatically changed from feat/splitstore-refactor-redux to feat/splitstore-refactor July 14, 2021 18:11
@Stebalien Stebalien force-pushed the feat/splitstore-refactor branch from c98ab56 to 68da14c Compare July 14, 2021 18:15
@vyzo vyzo changed the base branch from feat/splitstore-refactor to feat/splitstore-reorg July 14, 2021 18:22
@vyzo vyzo marked this pull request as draft July 14, 2021 19:30
@vyzo vyzo force-pushed the feat/splitstore-reorg branch from 02ef0de to 05b6ec9 Compare July 14, 2021 20:01
@Stebalien Stebalien force-pushed the feat/splitstore-reorg branch from 05b6ec9 to 5a23f64 Compare July 14, 2021 20:11
Base automatically changed from feat/splitstore-reorg to master July 14, 2021 23:57
@vyzo vyzo force-pushed the feat/splitstore-gc branch from 0ba34a9 to c51cada Compare July 15, 2021 04:42
@jennijuju jennijuju added this to the 1.11.1 milestone Jul 19, 2021
@vyzo vyzo mentioned this pull request Jul 23, 2021
@BlocksOnAChain
Contributor

@vyzo should we close this one since it's related to 1.11.1 which is out?
FYI: @jennijuju

@vyzo
Contributor Author

vyzo commented Sep 10, 2021 via email

@jennijuju jennijuju modified the milestones: 1.11.1 , v1.13.1 Oct 12, 2021
@Kubuxu Kubuxu closed this Nov 25, 2021
@Kubuxu Kubuxu deleted the feat/splitstore-gc branch November 25, 2021 19:01
@Kubuxu Kubuxu restored the feat/splitstore-gc branch November 25, 2021 19:08
@Kubuxu Kubuxu reopened this Nov 25, 2021
@ZenGround0 ZenGround0 self-assigned this Jun 21, 2022
@ZenGround0 ZenGround0 removed the team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs label Jun 27, 2022
@ZenGround0 ZenGround0 mentioned this pull request Jul 19, 2022
@magik6k
Contributor

magik6k commented Dec 5, 2022

Done in #9056

@magik6k magik6k closed this Dec 5, 2022
8 participants