-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for EVMC #17954
Support for EVMC #17954
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, would just like to clarify a few minor points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a minor series of questions
It's new year. Maybe it's time to finish this? |
Update:
|
Updated to EVMC 6.1.1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR in general looks ok, minor nits here and there. The three problematic parts are:
- A lot of sensitive consensus things are duplicated. That will bite back later when we need to modify them and won't be able to validate that EVMC still works correctly. Can we somehow unify the two implementations wrt gas prices, refunds, existence checks, etc?
- We must be able to construct multiple versions of EVMC, both with the same configs and with different configs to support concurrent executions.
- When creating an EVMC instance, I must be able to tell it whether to run in eWASM or EVM mode of operation. It must not hijack VM execution that is was not assigned. The current code will take away EVM execution from the built in VM even if I only enabled EVMC for eWASM.
createMu sync.Mutex | ||
evmcConfig string // The configuration the instance was created with. | ||
evmcInstance *evmc.Instance | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This global instance seems very dangerous long term (e.g. what happens if I want both execution and tracing with different options running concurrently? or block processing and mining concurrently? or concurrent transaction execution?).
How heavy is creating an EVMC instance? Do we have some numbers on it? Can't we create it on the fly when needed? Perhaps we could have a pool of them if they are expensive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally if we could have some sort of a map[config][]evmcInstance
(just an example for the idea). Then we could cover multiple code paths executing either the same or different EVMC instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplest is possibly a sync.Map
(thread safe) containing sync.Pool
objects (thread safe + auto GC) containing evmc.Instance
objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of global objects either, but it might be unavoidable in this case.
Let me start with some overview. First of all, the cost of creating an EVMC instance for interpreters is literally 0. They simply return their name and the ABI version. This is more expensive for EVMJIT where LLVM requires some global initialization to be performed. But anyway, the evmcInstance
is something else than an instance of thje Interpreter
. The evmcInstance
does not have any execution context attached to it. The execution context of a VM is hidden and handled internally by the Execute()
method which can be called concurrently. This also mean that e.g. you cannot inspect if an EVMC instance is in static mode currently. It simply represents a handle of dynamically loaded EVMC module.
That's why I think this is unavoidable for DLLs (you don't want to call dlopen for every contract execution) especially if they are configured with CLI options. There will be some maintenance cost related to that.
Configuration options for EVMC make it even more complicated. However, I'm leaning toward a decision to allow passing EVMC options only when an EVMC instance is created. If you need different options you need to create another instance. Moreover, tracing will be probably handled differently (not by EVMC options).
In this PR, you can only create single EVM and single EWASM instances so we have 2 different instances at most. I can try to use sync.Map
as a start (it should at least reduce the number of panic()
occurrences). Let me know what you think about it.
return output, err | ||
} | ||
|
||
func (evm *EVMC) CanRun(code []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, in practice you can still control this by EVMC options. E.g. for Hera you can set something like evm=off
so it will not report the EVM capability.
Secondly, I'm not sure what exactly are you asking for. Something like
func NewEVMC(ewasm bool, ...)
NewEVMC(false, ...)
NewEVMC(true, ...)
If yes, that sounds like a good change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of what Peter wants to say, and to summarize grossly, is that we want to have the equivalent of evm=on
. EVMC taking over EVM needs to be a conscious choice by the user, not the user.
Here, the default behavior of an EVMC backend client is to grab EVM code as soon as --vm.ewasm='libhera.so'
is set. We want the request for CapabilityEVM1
to only occur if the client expressedly said --vm.evm='libhera.so'
or something similar on the command line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.
This has been addressed. Now we have independent module instances for EVM and Ewasm.
@gballet had a nice idea that we should probably move the sstore gas tricks into EVMC itself and just expose AddRefund and family. Otherwise the whole gas calculation in the EMVC Go wrapper is really consensus logic that should have been handled by EVMC itself, just leaked into Geth. |
List of things I have to do:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for detailed review. I handled the most of coding style issues and easy fixes.
With other changes I will probably need ask some additional questions so expect them soon.
createMu sync.Mutex | ||
evmcConfig string // The configuration the instance was created with. | ||
evmcInstance *evmc.Instance | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of global objects either, but it might be unavoidable in this case.
Let me start with some overview. First of all, the cost of creating an EVMC instance for interpreters is literally 0. They simply return their name and the ABI version. This is more expensive for EVMJIT where LLVM requires some global initialization to be performed. But anyway, the evmcInstance
is something else than an instance of thje Interpreter
. The evmcInstance
does not have any execution context attached to it. The execution context of a VM is hidden and handled internally by the Execute()
method which can be called concurrently. This also mean that e.g. you cannot inspect if an EVMC instance is in static mode currently. It simply represents a handle of dynamically loaded EVMC module.
That's why I think this is unavoidable for DLLs (you don't want to call dlopen for every contract execution) especially if they are configured with CLI options. There will be some maintenance cost related to that.
Configuration options for EVMC make it even more complicated. However, I'm leaning toward a decision to allow passing EVMC options only when an EVMC instance is created. If you need different options you need to create another instance. Moreover, tracing will be probably handled differently (not by EVMC options).
In this PR, you can only create single EVM and single EWASM instances so we have 2 different instances at most. I can try to use sync.Map
as a start (it should at least reduce the number of panic()
occurrences). Let me know what you think about it.
return output, err | ||
} | ||
|
||
func (evm *EVMC) CanRun(code []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, in practice you can still control this by EVMC options. E.g. for Hera you can set something like evm=off
so it will not report the EVM capability.
Secondly, I'm not sure what exactly are you asking for. Something like
func NewEVMC(ewasm bool, ...)
NewEVMC(false, ...)
NewEVMC(true, ...)
If yes, that sounds like a good change.
Yes, this is planned for EVMC7. It was mistake to leave refund calculation responsibility to the Host. The Constantinople changes showed the problem clearly. So I don't think this can be addressed imminently, but my plan is to return the "refund counter" next to "gas left" in the execution result struct. Question: should it return already aggregated refund from subcalls? About It also improves UX. Currently, the EVMC is loaded on first use, which can happen quite late (in case of sync on the first TX execution which can be after hours). Users probably what to know earlier that the path they provided to the EVMC shared library is wrong. |
The potential issue that I foresee is with error handling: making sure that the correct gas is accounted for in case of an exception in the external library might quickly become a management hell, with all these sub-sub-sub-cases. But for all practical purposes, I don't think it makes any difference on the geth side.
That is certainly desirable, however how to you make sure that you can pre-instantiate every configuration? Also, some VMs (i.e. wagon) are stateful so you can't really instantiate it once and for all. You could only pre-instantiate the interpreter, then perform VM allocation when it's needed. But in any case loading could fail based on the content of the contract. so returning an error makes sense. It's not the only solution (the other one being to load the contract at a later stage) but it's the less intrusive one. |
Results from short review call with @karalabe:
|
@fjl @karalabe that still doesn't address the problem of multiple configurations. |
8383a5b
to
6f0b966
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EVMC should be loaded earlier, maybe on geth startup
This is addressed by having vm.InitEVMC()
function. This is quite simple solution, but we have to make sure this is called in every tool before EVMC can be used. Please comment if the places I put vm.InitEVMC()
invocations are right.
I believe I addressed all the previously reported issues.
Some refactoring around EVMC initialization should simplify code a lot. I'm starting doing it right now, so please tell me if this is good direction asap.
return output, err | ||
} | ||
|
||
func (evm *EVMC) CanRun(code []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.
This has been addressed. Now we have independent module instances for EVM and Ewasm.
Changes from my side are finished at this point. Waiting for review. |
// getRevision translates ChainConfig's HF block information into EVMC revision. | ||
func getRevision(env *EVM) evmc.Revision { | ||
n := env.BlockNumber | ||
conf := env.ChainConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there Istanbul here yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meanwhile, EVMC v6 doesn't offer proper Istanbul support, so better to leave it out :)
panic("EVMC VM path not provided, set --vm.(evm|ewasm)=/path/to/vm") | ||
} | ||
|
||
instance, err := evmc.Load(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use ethereum/evmc#404 here.
As for moving this forward, this comment from @karalabe is still outstanding:
|
BTW, do you want to keep the |
My plan is the following: The EVMC 7 will require more information about storage change status (flat list of state codes): ethereum/evmc#418. There are 9 of them in total. This will allow computing refunds on the EVM side. But to figure out the state change the code of similar complexity is needed. I propose to extract this logic into a separate function and use it in the main geth implementation, but also output the EVMC state change code - so the same function could be used also in EVMC integration part. That's the best idea I have for this issue. |
This PR is very stale by now, and as far as I know is not actively worked on, nor is it part of our roadmap going forward. I'll close this for now, since it'll likely not get merged. |
Previous iterations of these PR include:
Usage
You can download an Aleth release https://github.com/ethereum/aleth/releases. It contains the EVMC compatible libaleth-interpreter.so
To start geth with aleth interpreter:
Testing