PoC: FVM Debug Dual Execution #8841
CI will have to rerun once the ffi helper publishes the relevant build artifacts (not there yet; I have been testing locally with a devnet).
Thank you for taking this on! As a PoC, I think this looks good. I think using envvars to dictate where to look for potential overrides is good enough.
The biggest thing I'd like to see is some way to automatically make this happen for select state computations (e.g. when estimating gas, or when StateReplay is called). That is not directly relevant to this work, but really needs to happen soon.
@arajasek I'd prefer this PR to be based off master, let me know your thoughts!
I'm not sure -- there are users who want to use them, and ideally they would be able to do so in the minimum release. But it is ultimately nice-to-have, and any change late in the release process is undesirable, so happy to go into master instead.
I'll leave the rebase up to you. Updating the code now; I just have to move the util functions and it's done.
chain/vm/vmi.go (outdated)
		return NewFVM(ctx, opts)
	}

	// Remove after v16 upgrade, this is only to support testing and validation of the FVM
	if useFvmForMainnetV15 && opts.NetworkVersion >= network.Version15 {
		if os.Getenv("LOTUS_FVM_DEBUG") == "1" {
Okay with this duplication cuz it's getting dropped soon
Codecov Report
@@            Coverage Diff             @@
##           master    #8841      +/-   ##
==========================================
- Coverage   40.73%   40.62%   -0.12%
==========================================
  Files         705      705
  Lines       78574    78692     +118
==========================================
- Hits        32009    31965      -44
- Misses      41101    41239     +138
- Partials     5464     5488      +24
	go func() {
		defer wg.Done()
		ret, err = vm.main.ApplyMessage(ctx, cmsg)
	}()

	go func() {
		defer wg.Done()
		// Debug failures are only logged; they never affect the main result.
		if _, err := vm.debug.ApplyMessage(ctx, cmsg); err != nil {
			log.Errorf("debug execution failed: %s", err)
		}
	}()
So... this doesn't work as the debug messages will have different gas fees. That means:
- The balances will be different.
- Some messages may just fail.
This will lead to really annoying and hard to debug issues. We can merge this as a WIP_JUST_TESTING patch, but we need to make that clear.
The right way to do this would be to:
1. Apply in debug mode (disabling gas accounting?). Balances will be correct because we charge for gas by charging for the gas limit up-front, then refunding any leftovers. So the balance will be correct for the duration of the message execution, just not after the message is done executing.
2. Revert.
3. Apply normally.
4. Go back to 1 for the next message.
But:
- We can't currently perform that revert.
- That is even more likely to cause problems...
I think the answer here is to leave this as a super experimental feature, but be very clear that this is just for debugging and absolutely not to be relied on.
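A minimal sketch of the apply-in-debug, revert, apply-normally loop described above, using a toy in-memory balance map rather than the real Lotus VM. Every name here (State, Message, apply, applyWithDebug) and the fixed gas price of 1 are assumptions made for illustration; the "revert" is modelled by running the debug pass on a throwaway copy, which is exactly the capability the comment says is not currently available.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a toy stand-in for chain state: account name -> balance.
type State map[string]int64

// clone returns a copy, so a debug run can be thrown away afterwards.
func (s State) clone() State {
	c := make(State, len(s))
	for k, v := range s {
		c[k] = v
	}
	return c
}

// Message is a hypothetical, heavily simplified message (gas price fixed at 1).
type Message struct {
	From     string
	GasLimit int64
}

// apply charges the full gas limit up front, "executes", then refunds the
// unused gas. While the message runs, the sender's balance is the same no
// matter how much gas ends up being used; only the refund differs.
func apply(s State, m Message, gasUsed int64) error {
	if s[m.From] < m.GasLimit {
		return errors.New("insufficient balance for gas limit")
	}
	if gasUsed > m.GasLimit {
		gasUsed = m.GasLimit // out of gas: nothing left to refund
	}
	s[m.From] -= m.GasLimit           // up-front charge
	s[m.From] += m.GasLimit - gasUsed // refund of leftovers
	return nil
}

// applyWithDebug runs the debug pass on a scratch copy (the "revert" step),
// then applies the message normally to the real state.
func applyWithDebug(s State, m Message, debugGas, normalGas int64) error {
	scratch := s.clone()
	if err := apply(scratch, m, debugGas); err != nil {
		fmt.Println("debug execution failed:", err) // logged, never returned
	}
	return apply(s, m, normalGas)
}

func main() {
	msg := Message{From: "alice", GasLimit: 40}

	// Independent debug VM, as in the PoC: its state drifts from the main one.
	mainState, debugState := State{"alice": 100}, State{"alice": 100}
	for i := 0; i < 3; i++ {
		_ = apply(mainState, msg, 10)  // canonical actors use 10 gas
		_ = apply(debugState, msg, 35) // debug actors use 35 gas
	}
	fmt.Println(mainState["alice"], debugState["alice"])

	// Debug-then-revert: only the normal execution is ever committed.
	s := State{"alice": 100}
	for i := 0; i < 3; i++ {
		_ = applyWithDebug(s, msg, 35, 10)
	}
	fmt.Println(s["alice"])
}
```

In the first loop the debug state ends at 30 while the main state ends at 70, and the third debug message fails outright because 30 is below the gas limit of 40. In the second loop the committed state ends at 70, since the debug pass never touches it.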
Yeah, I'll throw on a disclaimer.
Honestly, the biggest use I envision getting out of this is actually intentionally failing messages, and using the actor error string to convey debug information (e.g. adding actor_error!("i reached this if branch because the miner power is non-zero"); to the miner_actor).
	var newManifestData manifest.ManifestData
	if err := store.Get(ctx, newManifest.Data, &newManifestData); err != nil {
This will cause us to load the ManifestData twice, right?
Do we even use this?
Yes, solely to check the number of entries, which I can't do from newManifest directly because there's no way (yet) to count the number of entries in a Manifest
...
See filecoin-project/ref-fvm#592
Depends on:
How it works
When LOTUS_FVM_DEBUG=1 is set, dual execution is triggered with actor debugging enabled for its side effects. If LOTUS_FVM_DEBUG_BUNDLE_V8 is also specified, then the bundle is loaded and execution is redirected from the canonical (consensus) actors to the actors in the bundle during debug execution.
Some niceties to consider
- An output syscall to the FVM that captures output during debug execution and is a no-op otherwise.
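The environment-variable switch described under "How it works" could be read roughly as follows. This is a sketch, not the code in this PR: debugConfig and loadDebugConfig are hypothetical names, and treating the value of LOTUS_FVM_DEBUG_BUNDLE_V8 as a path to the bundle is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// debugConfig is a hypothetical holder for the two settings from the PR text.
type debugConfig struct {
	Enabled    bool   // LOTUS_FVM_DEBUG == "1"
	BundlePath string // assumed: LOTUS_FVM_DEBUG_BUNDLE_V8 holds a bundle path
}

// loadDebugConfig takes the lookup function as a parameter so it can be
// exercised without touching the real process environment.
func loadDebugConfig(getenv func(string) string) debugConfig {
	cfg := debugConfig{Enabled: getenv("LOTUS_FVM_DEBUG") == "1"}
	if cfg.Enabled {
		// The bundle override only matters when debug execution is on.
		cfg.BundlePath = getenv("LOTUS_FVM_DEBUG_BUNDLE_V8")
	}
	return cfg
}

func main() {
	cfg := loadDebugConfig(os.Getenv)
	switch {
	case !cfg.Enabled:
		fmt.Println("single execution")
	case cfg.BundlePath == "":
		fmt.Println("dual execution, canonical actors with debugging")
	default:
		fmt.Printf("dual execution, debug actors from %s\n", cfg.BundlePath)
	}
}
```

Passing the lookup function in keeps the decision logic testable; in a node it would simply be called with os.Getenv.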