Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for EVMC #17954

Closed
wants to merge 1 commit into from
Closed

Support for EVMC #17954

wants to merge 1 commit into from

Conversation

chfast
Copy link
Member

@chfast chfast commented Oct 21, 2018

Previous iterations of these PR include:

Usage

You can download an Aleth release https://github.com/ethereum/aleth/releases. It contains the EVMC compatible libaleth-interpreter.so

To start geth with aleth interpreter:

geth --vm.evm=./libaleth-interpreter.so

Testing

go test ./tests/... -run TestState -vm.evm ./libaleth-interpreter.so

@gballet gballet self-requested a review October 22, 2018 07:10
@gballet gballet self-assigned this Oct 22, 2018
Copy link
Member

@gballet gballet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, would just like to clarify a few minor points.

core/vm/evm.go Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
Copy link
Member

@gballet gballet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a minor series of questions

core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evm.go Show resolved Hide resolved
@chfast
Copy link
Member Author

chfast commented Jan 4, 2019

It's new year. Maybe it's time to finish this?

@chfast
Copy link
Member Author

chfast commented Jan 15, 2019

Update:

  • Rebased on top of v1.8.20.
  • The --vm.evm flag added to evm tool.

@axic
Copy link
Member

axic commented Feb 5, 2019

@chfast @gballet do we need another round of reviews here? 😉

@chfast
Copy link
Member Author

chfast commented Feb 14, 2019

Updated to EVMC 6.1.1. The ConstantinopleFix is broken. Investigating... False alarm, I accidentally used wrong aleth-interpreter binary.

Copy link
Member

@karalabe karalabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR in general looks ok, minor nits here and there. The three problematic parts are:

  • A lot of sensitive consensus things are duplicated. That will bite back later when we need to modify them and won't be able to validate that EVMC still works correctly. Can we somehow unify the two implementations wrt gas prices, refunds, existence checks, etc?
  • We must be able to construct multiple versions of EVMC, both with the same configs and with different configs to support concurrent executions.
  • When creating an EVMC instance, I must be able to tell it whether to run in eWASM or EVM mode of operation. It must not hijack VM execution that is was not assigned. The current code will take away EVM execution from the built in VM even if I only enabled EVMC for eWASM.

core/vm/evm.go Show resolved Hide resolved
core/vm/evm.go Outdated Show resolved Hide resolved
core/vm/evm.go Outdated Show resolved Hide resolved
core/vm/evmc.go Show resolved Hide resolved
createMu sync.Mutex
evmcConfig string // The configuration the instance was created with.
evmcInstance *evmc.Instance
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This global instance seems very dangerous long term (e.g. what happens if I want both execution and tracing with different options running concurrently? or block processing and mining concurrently? or concurrent transaction execution?).

How heavy is creating an EVMC instance? Do we have some numbers on it? Can't we create it on the fly when needed? Perhaps we could have a pool of them if they are expensive?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally if we could have some sort of a map[config][]evmcInstance (just an example for the idea). Then we could cover multiple code paths executing either the same or different EVMC instances.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest is possibly a sync.Map (thread safe) containing sync.Pool objects (thread safe + auto GC) containing evmc.Instance objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of global objects either, but it might be unavoidable in this case.

Let me start with some overview. First of all, the cost of creating an EVMC instance for interpreters is literally 0. They simply return their name and the ABI version. This is more expensive for EVMJIT where LLVM requires some global initialization to be performed. But anyway, the evmcInstance is something else than an instance of thje Interpreter. The evmcInstance does not have any execution context attached to it. The execution context of a VM is hidden and handled internally by the Execute() method which can be called concurrently. This also mean that e.g. you cannot inspect if an EVMC instance is in static mode currently. It simply represents a handle of dynamically loaded EVMC module.

That's why I think this is unavoidable for DLLs (you don't want to call dlopen for every contract execution) especially if they are configured with CLI options. There will be some maintenance cost related to that.

Configuration options for EVMC make it even more complicated. However, I'm leaning toward a decision to allow passing EVMC options only when an EVMC instance is created. If you need different options you need to create another instance. Moreover, tracing will be probably handled differently (not by EVMC options).

In this PR, you can only create single EVM and single EWASM instances so we have 2 different instances at most. I can try to use sync.Map as a start (it should at least reduce the number of panic() occurrences). Let me know what you think about it.

core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
return output, err
}

func (evm *EVMC) CanRun(code []byte) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, in practice you can still control this by EVMC options. E.g. for Hera you can set something like evm=off so it will not report the EVM capability.

Secondly, I'm not sure what exactly are you asking for. Something like

func NewEVMC(ewasm bool, ...)

NewEVMC(false, ...)
NewEVMC(true, ...)

If yes, that sounds like a good change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of what Peter wants to say, and to summarize grossly, is that we want to have the equivalent of evm=on. EVMC taking over EVM needs to be a conscious choice by the user, not the user.

Here, the default behavior of an EVMC backend client is to grab EVM code as soon as --vm.ewasm='libhera.so' is set. We want the request for CapabilityEVM1 to only occur if the client expressedly said --vm.evm='libhera.so' or something similar on the command line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.

This has been addressed. Now we have independent module instances for EVM and Ewasm.

@karalabe
Copy link
Member

@gballet had a nice idea that we should probably move the sstore gas tricks into EVMC itself and just expose AddRefund and family. Otherwise the whole gas calculation in the EMVC Go wrapper is really consensus logic that should have been handled by EVMC itself, just leaked into Geth.

@gballet
Copy link
Member

gballet commented Feb 27, 2019

List of things I have to do:

Copy link
Member Author

@chfast chfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for detailed review. I handled the most of coding style issues and easy fixes.
With other changes I will probably need ask some additional questions so expect them soon.

createMu sync.Mutex
evmcConfig string // The configuration the instance was created with.
evmcInstance *evmc.Instance
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of global objects either, but it might be unavoidable in this case.

Let me start with some overview. First of all, the cost of creating an EVMC instance for interpreters is literally 0. They simply return their name and the ABI version. This is more expensive for EVMJIT where LLVM requires some global initialization to be performed. But anyway, the evmcInstance is something else than an instance of thje Interpreter. The evmcInstance does not have any execution context attached to it. The execution context of a VM is hidden and handled internally by the Execute() method which can be called concurrently. This also mean that e.g. you cannot inspect if an EVMC instance is in static mode currently. It simply represents a handle of dynamically loaded EVMC module.

That's why I think this is unavoidable for DLLs (you don't want to call dlopen for every contract execution) especially if they are configured with CLI options. There will be some maintenance cost related to that.

Configuration options for EVMC make it even more complicated. However, I'm leaning toward a decision to allow passing EVMC options only when an EVMC instance is created. If you need different options you need to create another instance. Moreover, tracing will be probably handled differently (not by EVMC options).

In this PR, you can only create single EVM and single EWASM instances so we have 2 different instances at most. I can try to use sync.Map as a start (it should at least reduce the number of panic() occurrences). Let me know what you think about it.

core/vm/evmc.go Outdated Show resolved Hide resolved
core/vm/evmc.go Outdated Show resolved Hide resolved
return output, err
}

func (evm *EVMC) CanRun(code []byte) bool {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, in practice you can still control this by EVMC options. E.g. for Hera you can set something like evm=off so it will not report the EVM capability.

Secondly, I'm not sure what exactly are you asking for. Something like

func NewEVMC(ewasm bool, ...)

NewEVMC(false, ...)
NewEVMC(true, ...)

If yes, that sounds like a good change.

@chfast
Copy link
Member Author

chfast commented Feb 27, 2019

@gballet had a nice idea that we should probably move the sstore gas tricks into EVMC itself and just expose AddRefund and family. Otherwise the whole gas calculation in the EMVC Go wrapper is really consensus logic that should have been handled by EVMC itself, just leaked into Geth.

Yes, this is planned for EVMC7. It was mistake to leave refund calculation responsibility to the Host. The Constantinople changes showed the problem clearly. So I don't think this can be addressed imminently, but my plan is to return the "refund counter" next to "gas left" in the execution result struct. Question: should it return already aggregated refund from subcalls?

About NewEVM() returning error #19175:
If this is only needed for EVMC, I think there are better way of handling this.
As I tried to explain in the inline comment #17954 (comment), errors can only happen because of initial EVMC loading and configuration. I think it would be easier to have it under control if EVMC is loaded and configured at startup. It should be easier to handle those errors imminently and then NewEVMC would be error-free.

It also improves UX. Currently, the EVMC is loaded on first use, which can happen quite late (in case of sync on the first TX execution which can be after hours). Users probably what to know earlier that the path they provided to the EVMC shared library is wrong.

@gballet
Copy link
Member

gballet commented Feb 28, 2019

@gballet had a nice idea that we should probably move the sstore gas tricks into EVMC itself and just expose AddRefund and family. Otherwise the whole gas calculation in the EMVC Go wrapper is really consensus logic that should have been handled by EVMC itself, just leaked into Geth.

Yes, this is planned for EVMC7. It was mistake to leave refund calculation responsibility to the Host. The Constantinople changes showed the problem clearly. So I don't think this can be addressed imminently, but my plan is to return the "refund counter" next to "gas left" in the execution result struct. Question: should it return already aggregated refund from subcalls?

The potential issue that I foresee is with error handling: making sure that the correct gas is accounted for in case of an exception in the external library might quickly become a management hell, with all these sub-sub-sub-cases. But for all practical purposes, I don't think it makes any difference on the geth side.

About NewEVM() returning error #19175:
If this is only needed for EVMC, I think there are better way of handling this.
As I tried to explain in the inline comment #17954 (comment), errors can only happen because of initial EVMC loading and configuration. I think it would be easier to have it under control if EVMC is loaded and configured at startup. It should be easier to handle those errors imminently and then NewEVMC would be error-free.

That is certainly desirable, however how to you make sure that you can pre-instantiate every configuration? Also, some VMs (i.e. wagon) are stateful so you can't really instantiate it once and for all. You could only pre-instantiate the interpreter, then perform VM allocation when it's needed. But in any case loading could fail based on the content of the contract. so returning an error makes sense. It's not the only solution (the other one being to load the contract at a later stage) but it's the less intrusive one.

@fjl
Copy link
Contributor

fjl commented Feb 28, 2019

Results from short review call with @karalabe:

  • global EVMC instance is OK
  • duplicating refunds code is OK for now if it is cleaned up later
  • EVMC should be loaded earlier, maybe on geth startup
  • the wrapper code needs to support setting where EVMC is only used for eWASM but not EVM

@gballet
Copy link
Member

gballet commented Feb 28, 2019

Results from short review call with @karalabe:

* global EVMC instance is OK

* duplicating refunds code is OK for now if it is cleaned up later

* EVMC should be loaded earlier, maybe on geth startup

* the wrapper code needs to support setting where EVMC is only used for eWASM but not EVM

@fjl @karalabe that still doesn't address the problem of multiple configurations.

@chfast chfast force-pushed the evmc-v6 branch 2 times, most recently from 8383a5b to 6f0b966 Compare May 22, 2019 12:45
Copy link
Member Author

@chfast chfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EVMC should be loaded earlier, maybe on geth startup

This is addressed by having vm.InitEVMC() function. This is quite simple solution, but we have to make sure this is called in every tool before EVMC can be used. Please comment if the places I put vm.InitEVMC() invocations are right.

I believe I addressed all the previously reported issues.

Some refactoring around EVMC initialization should simplify code a lot. I'm starting doing it right now, so please tell me if this is good direction asap.

return output, err
}

func (evm *EVMC) CanRun(code []byte) bool {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code segment will hijack EVM too, even if I only want to configure EVMC for eWASM. Please make the EVM specializable to either eWASM or EVM, and hard select one or the other based on what it was constructed for.

This has been addressed. Now we have independent module instances for EVM and Ewasm.

@chfast
Copy link
Member Author

chfast commented Jul 3, 2019

Changes from my side are finished at this point. Waiting for review.

// getRevision translates ChainConfig's HF block information into EVMC revision.
func getRevision(env *EVM) evmc.Revision {
n := env.BlockNumber
conf := env.ChainConfig()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there Istanbul here yet?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meanwhile, EVMC v6 doesn't offer proper Istanbul support, so better to leave it out :)

@axic
Copy link
Member

axic commented Aug 12, 2019

@gballet @karalabe @chfast would making this version of the PR EVM-only remove some risks and accelerate the merging process?

panic("EVMC VM path not provided, set --vm.(evm|ewasm)=/path/to/vm")
}

instance, err := evmc.Load(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use ethereum/evmc#404 here.

@holiman
Copy link
Contributor

holiman commented Sep 2, 2019

As for moving this forward, this comment from @karalabe is still outstanding:

  • A lot of sensitive consensus things are duplicated. That will bite back later when we need to modify them and won't be able to validate that EVMC still works correctly. Can we somehow unify the two implementations wrt gas prices, refunds, existence checks, etc?

@chfast
Copy link
Member Author

chfast commented Sep 2, 2019

As for moving this forward, this comment from @karalabe is still outstanding:

* A lot of sensitive consensus things are duplicated. That will bite back later when we need to modify them and won't be able to validate that EVMC still works correctly. Can we somehow unify the two implementations wrt gas prices, refunds, existence checks, etc?
* duplicating refunds code is OK for now if it is cleaned up later

BTW, do you want to keep the --vm.ewasm option?

@chfast
Copy link
Member Author

chfast commented Oct 29, 2019

* A lot of sensitive consensus things are duplicated. That will bite back later when we need to modify them and won't be able to validate that EVMC still works correctly. Can we somehow unify the two implementations wrt gas prices, refunds, existence checks, etc?

My plan is the following:

The EVMC 7 will require more information about storage change status (flat list of state codes): ethereum/evmc#418. There are 9 of them in total. This will allow computing refunds on the EVM side. But to figure out the state change the code of similar complexity is needed. I propose to extract this logic into a separate function and use it in the main geth implementation, but also output the EVMC state change code - so the same function could be used also in EVMC integration part. That's the best idea I have for this issue.

@chfast chfast changed the title Support for EVMC 6 Support for EVMC May 20, 2020
@holiman
Copy link
Contributor

holiman commented Oct 29, 2021

This PR is very stale by now, and as far as I know is not actively worked on, nor is it part of our roadmap going forward. I'll close this for now, since it'll likely not get merged.

@holiman holiman closed this Oct 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants