Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: chain: create a new VM for each epoch #7966

Merged
merged 1 commit into from
Jan 18, 2022
Merged

Conversation

arajasek
Copy link
Contributor

@arajasek arajasek commented Jan 17, 2022

Related Issues

Fixes bug introduced in #7818. The VM's network version was always being determined based on epoch, the input to the ApplyBlocks method, even though the vm executes (potentially) multiple epochs worth of messages. The result is that if there was a null epoch before the upgrade epoch, the VM used the wrong network version (the new version) for running cron before the migration.

Proposed Changes

Although this was a fairly specific case, I think this issue shows the risks of trying to hamfistedly reuse the VM for multiple epochs of execution (see #7890 for a similar issue). This PR changes behaviour to make VMs one-per-epoch. We remove the old SetBlockHeight method entirely, and always create a new VM when moving between epochs. The result is that fields that need updating between epochs (like network version) are guaranteed to update this way.

Additional Info

Checklist

Before you mark the PR ready for review, please make sure that:

  • All commits have a clear commit message.
  • The PR title is in the form of of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, INTERFACE BREAKING CHANGE, CONSENSUS BREAKING, build, chore, ci, docs, misc,perf, refactor, revert, style, test
    • area: api, chain, state, vm, data transfer, market, mempool, message, block production, multisig, networking, paychan, proving, sealing, wallet
  • This PR has tests for new functionality or change in behaviour
  • If new user-facing features are introduced, clear usage guidelines and / or documentation updates should be included in https://lotus.filecoin.io or Discussion Tutorials.
  • CI is green

@arajasek arajasek requested a review from a team as a code owner January 17, 2022 21:38
@codecov
Copy link

codecov bot commented Jan 17, 2022

Codecov Report

Merging #7966 (9ec1abf) into master (f628e12) will decrease coverage by 0.15%.
The diff coverage is 40.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7966      +/-   ##
==========================================
- Coverage   39.33%   39.17%   -0.16%     
==========================================
  Files         658      658              
  Lines       71142    71135       -7     
==========================================
- Hits        27985    27869     -116     
- Misses      38341    38443     +102     
- Partials     4816     4823       +7     
Impacted Files Coverage Δ
chain/vm/vm.go 62.69% <ø> (-0.11%) ⬇️
chain/consensus/filcns/filecoin.go 51.73% <20.00%> (-0.56%) ⬇️
chain/consensus/filcns/compute_state.go 73.96% <50.00%> (+0.70%) ⬆️
itests/kit/blockminer.go 74.11% <0.00%> (-17.06%) ⬇️
markets/loggers/loggers.go 89.28% <0.00%> (-10.72%) ⬇️
chain/actors/builtin/miner/diff.go 48.52% <0.00%> (-10.30%) ⬇️
node/impl/full/mpool.go 25.21% <0.00%> (-8.70%) ⬇️
extern/storage-sealing/sector_state.go 92.30% <0.00%> (-7.70%) ⬇️
chain/stmgr/call.go 67.87% <0.00%> (-7.28%) ⬇️
chain/stmgr/execute.go 86.95% <0.00%> (-4.35%) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f628e12...9ec1abf. Read the comment docs.

}

pstate, err = vmi.Flush(ctx)
pstate, err = vmCron.Flush(ctx)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the repeated use of the same pstate and err here -- would appreciate suggestions on how to clean this up / derisk this. Winding up with the wrong field in pstate would be very bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always use a temporary then set pstate = newState. But this doesn't seem too bad.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a reasonable change. I can't find anything wrong with it, at least.

}

pstate, err = vmi.Flush(ctx)
pstate, err = vmCron.Flush(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always use a temporary then set pstate = newState. But this doesn't seem too bad.

Copy link
Contributor

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to make sense

@arajasek arajasek merged commit c0f29ea into master Jan 18, 2022
@arajasek arajasek deleted the asr/compute-state-vm branch January 18, 2022 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants