Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EVM methods for encodeABI & decodeABI #5024

Merged
merged 27 commits into from
Dec 13, 2023

Conversation

m-Peter
Copy link
Collaborator

@m-Peter m-Peter commented Nov 15, 2023

Closes #4937

Introduces two natively-implemented methods on the EVM contract, for encoding/decoding ABI:

  • EVM.encodeABI
  • EVM.decodeABI

Note: Some stuff are still to be decided.

Copy link
Member

@turbolent turbolent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

As discussed on Discord, it might be nice to "future-proof" the Cadence API for the EVM ABI, so that it can be easily extended without breaking changes in the future.

Proposal: Make the API more object-oriented:

  • Introduce a Cadence type for the ABI, and allow creating an instance of it from a JSON string. Later this can be extended to allow constructing an ABI without the JSON.
  • Refactor the current encodeABI/decodeABI functions of the EVM contract into functions of the ABI type. This can be later used to encode arguments even for an ABI that was not constructed from JSON.

This basically reflects geth's API, i.e. abi.ABI. As mentioned above, later we could introduce ABI types like EVM.Method, EVM.Event, etc. and have getter functions on the ABI type (again, similar to the Go API).

Concretely, maybe we can have something like

contract EVM {

    // ...

    struct ABI {
        fun encode(arguments: [AnyStruct]): [UInt8] { /* ... */ }
        fun decode(data: [UInt8]): [AnyStruct] { /* ... */ }
    }

    fun abiFromJSON(): ABI { /* ... */ }

This is just an idea, and maybe out of scope for the MVP, happy to get this in and we can discuss the re-design and implement it if necessary in a follow-up PR.

@codecov-commenter
Copy link

codecov-commenter commented Nov 15, 2023

Codecov Report

Attention: 79 lines in your changes are missing coverage. Please review.

Comparison is base (cb152a0) 56.39% compared to head (5b16ed6) 56.53%.

Files Patch % Lines
fvm/evm/stdlib/contract.go 85.50% 48 Missing and 31 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5024      +/-   ##
==========================================
+ Coverage   56.39%   56.53%   +0.13%     
==========================================
  Files         977      977              
  Lines       91987    92529     +542     
==========================================
+ Hits        51877    52310     +433     
- Misses      36268    36341      +73     
- Partials     3842     3878      +36     
Flag Coverage Δ
unittests 56.53% <85.50%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ramtinms ramtinms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, we just need to add more comprehensive computation metering here,

Maybe not 100% related to this PR but, I'm a bit confused about the application, if the encoding data is known in advance and this is a pure function, why do we need to do it on chain?

@turbolent
Copy link
Member

@franklywatson Do you have any more information as to how cross-chain integration work needs the Ethereum ABI encoding and decoding functions to be available on-chain?

@ryan-suematsu-anchain
Copy link

ryan-suematsu-anchain commented Nov 16, 2023

@franklywatson Do you have any more information as to how cross-chain integration work needs the Ethereum ABI encoding and decoding functions to be available on-chain?

Hey @turbolent 👋 ,

I am one of the devs for the axelar-flow bridge. Heres a quick run down of ABI encode/decode and why we need it for our use case.

So EVM frequently uses encode/decode abi to pass data between on chain smart contracts and dapps, sometimes thats a way to return dynamic data, or pass in dynamic function args to functions. It can be thought of as a way to 'stringify'/serialize your data in a portable format.

In our case we need it to pass dynamic data into dapp functions after the data has been signed by the axelar network. The flow goes something like this,

  • the axelar network validates a GMP (function call)

  • the axelar network then signs the gmp data by creating a hash of the input data. This input data is something like, (destinationAddr, sourceAddr, destinationChain, sourceChain, payload). (The payload is what is key here). This payload is generated by the source dapp, often created on the corresponding chain.

    • This payload needs to be a string of data that can be signed off chain, passed on chain, then parsed to its required arguments.
  • This hash gets passed to the cross chain service, where we can validate the hash was signed by the correct individuals and we can recreate the hash to verify that the data passed in was indeed the data that was signed.

  • ...some internal logic happens to approve the call

  • When that call to the destination contract (on flow) is made, we pass in the payload as well and do the same logic checks. to ensure that the payload and all concerning data was included in the signed hash.

  • after that is verified, the payload is then passed to the destination function, where the data can be decoded/parsed into its required format.

its crucial that we use some sort of arbitrary serialization technique for the payload, as we want to allow the destination dapp to be able to pass in as many custom arguments as required.
its also good to note, that EVM doesnt use something like json parsing and stringifying because string manipulation is very expensive on chain.


encodedABI := cadence.NewArray(
[]cadence.Value{
cadence.UInt8(0xef), cadence.UInt8(0x84), cadence.UInt8(0xad), cadence.UInt8(0x63),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe write a function that expands the values, so something like:
pseudo:

cadenceUInt8(vals ...cadence.UIn8) []cadence.UInt8 {
 for 32 - len(vals) ->  append vals []UInt8{0x0}
 return vals
}

so you generate extended array

@turbolent
Copy link
Member

@ryan-suematsu-anchain Thank you for the great write-up Ryan!

Just to confirm, would the functions as proposed/implemented in the PR be useful for you? If I understand you correctly, at the end of your comment you're saying that passing the ABI to the chain to encode values to data, or decode encoded data to values, is uncommon (not done at all?).

In the last point, you mention "the data can be decoded/parsed into its required format". Would that be the place where the ABI decoding function would be called?

@franklywatson
Copy link
Contributor

franklywatson commented Nov 17, 2023

@turbolent was reading your comment "you're saying that passing the ABI to the chain to encode values to data, or decode encoded data to values, is uncommon (not done at all?)". The idea that because it's a rarely used set of functionality it's maybe not that useful isn't the right question here IMO.

To my understanding, the need for the functionality arises from the fact that Flow's Axelar gateway contracts are merely one node on the cross-chain network messaging protocol. The expectations of the protocol are that participating gateways must support ABI.encode/decode since it does get used (albeit infrequently) and because it's decentralized Flow needs to be at parity with the rest of the network.

@ryan-suematsu-anchain please correct me if I'm wrong and/or to add more color, will leave Bastian's final point for you

@ryan-suematsu-anchain
Copy link

@franklywatson is right, I think its an important feature to have not only for us but other cross chain applications looking to build for Flow/EVM cross chain solutions.

@turbolent I think there might be a misunderstanding regarding the difference between something like Json.parse/stringify and ABI.encode/decode. In EVM the abi encoding takes advantage of alot of optimizations since its a byte encoding format, where as a json stringify/parse format uses string manipulation and often avoided because of the high gas cost that comes with it. I would say ABI encode/decode is pretty frequently in various EVM cases such as, complicated return structs, passing in dynamic lengths of multiple args, and cross chain apps!

"Just to confirm, would the functions as proposed/implemented in the PR be useful for you?"
Yes!!

"In the last point, you mention "the data can be decoded/parsed into its required format". Would that be the place where the ABI decoding function would be called?"
Yes!!

@turbolent
Copy link
Member

turbolent commented Nov 21, 2023

@franklywatson @ryan-suematsu-anchain Thank you for the additional info and answers 👍

Follow-up question: Given that encoding and decoding is useful on-chain, looking at e.g. https://www.wtf.academy/en/solidity-advanced/ABIEncode/, how common is it to pass the ABI for a whole contract on-chain , and use it to encode/decode, vs. what the linked page describes, encoding arguments / decoding encoded arguments given a single function signature?

For example, we might also want to have, or even only have, encode/decode functions like, e.g.

contract EVM {
  fun encodeWithSignatur(_ signature: String, arguments: [AnyStruct]): [UInt8] { /* ... */ }
  fun decodeWithSignature(_ signature: String, data: [UInt8]): [AnyStruct] { /* ... */ }
}

let data = EVM.encodeWithSignature("foo(uint256,string)", arguments: [42, "bar"])
let arguments = EVM.decodeWithSignature("foo(uint256,string)", data: data)

Do we need/want packed encoding?

arguments = append(arguments, uint64(value))
}
}
packed, err := abi.Pack("details", arguments...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I understand the ABI encoding/decoding more: Doesn't this hardcode the method name to "details"? Shouldn't the method name be passed to the encodeABI (and decodeABI) function, i.e. should those functions have an additional parameter?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/axelarnetwork/axelar-gmp-sdk-solidity/blob/66325132bde3f61d2c18748cfcd3e181b063ea00/contracts/test/mocks/MockGateway.sol#L224C32-L224C32

Here is a good example of how we plan to use ABI encode/decode on chain. Mainly passing in a 'partial' abi to encode only those variables into bytes. I would say encode packed is less important of a feature as it seems to do everything normal encode/decode does in less space. It may be good to support in a future roadmap. I think the functions you outlined are a good place to start. It would allow developers to create data packages that would be portable between flow and evm as well as generate EVM function calls/signatures on flow.

@turbolent
Copy link
Member

turbolent commented Dec 1, 2023

We had a call about several details of the API

Functionality

Solidity provides several different functions for encoding and decoding data (e.g. see https://www.wtf.academy/en/solidity-advanced/ABIEncode/). For example, it supports

  1. Encoding of just multiple values, either "packed" (shorter) or unpacked (abi.encode and abi.encodePacked)
  2. Encoding of function call, by passing function name and arguments (abi.encodeWithSignature)

To start with, we need at least 1), and the packed variant is desirable.
In the future we might also want to consider adding Cadence equivalents of the other ABI encoding/decoding functions.

If the Cadence encoding function produces a packed encoding, it should probably indicate that in the name, e.g encodePacked, so that later we can also add e.g. a encoding function for the unpacked variant.

Cadence API

For example, the Cadence API could make use of type parameters, and similar to the Solidity function, take all values through a variable-sized parameter list (straw man syntax, as Cadence syntax does not allow such functions to be written):

fun encodePacked<T1, ..., TN>(_ v1: T1, ..., _ vN: TN): [UInt8]

Essentially, this function has a variable length of parameters / accepts an arbitrary number of arguments, and has one type parameter for each (value) parameter.

This would allow a developer to simply call the function with any arguments, and the type checker would infer the type parameters automatically:

For example, this function call would encode an Int and a String (inferred from the arguments):

let data = encodePacked(42, "bar")

If however the user wanted to encode the integer literal as a UInt256, they could do so by either asserting the type in the argument expression, or enforce the type in the type parameter list:

let data = encodePacked(42 as UInt256, "bar")
let data = encodePacked<UInt256, String>(42, "bar")

Similarly, the decoding function could look like:

fun decode<T1, ..., TN>(data: [UInt8]): [U]

Note that given that Cadence does not (yet) support multiple return values or tuples, the return type would be [U], where U is the least-common supertype of all decoded types (T1, ..., TN).

A function call could look like:

let values = decode<UInt256, String>(data)   // `values` has type `[AnyStruct]`

Location

It probably makes sense to have these ABI encoding/decoding functions be part of the EVM contract. They could also be part of a new/different contract.

Given that the functions are currently implemented natively (Go), using the existing setup for injecting the native functions into the EVM contract (via the InternalEVM contract), "reusing"/working on top of that code is probably easiest.

Mapping between Cadence and Solidity Types

The mapping between Solidity types and Cadence types could look like

  • bool ↔️ Bool
  • string ↔️ String
  • Integer types: uint256 ↔️ UInt256. For sizes where there is no Cadence equivalent: maybe just Int or UInt?
  • Array
  • Address:
    • address decodes to EVM.EVMAddres
    • Allow encoding of String and EVM.EVMAddres to address

Metering

All computation performed by encoding and decoding needs to be metered. That means that the native (Go) code must be annotated with computation metering calls. The metering needs to handle containers (arrays).

Example: RLP encoding/decoding

@m-Peter m-Peter force-pushed the abi-encoding-decoding branch 2 times, most recently from bc98b03 to 0b25342 Compare December 3, 2023 19:20
@m-Peter
Copy link
Collaborator Author

m-Peter commented Dec 4, 2023

Following the sync on Friday and the suggestions in #5024 (comment),
I have updated the EVM.encodeABI & EVM.decodeABI methods to operate on variables, e.g.:

// Encode & decode `String`, `UInt64` and `Bool`
var data = EVM.encodeABI(["John Doe", UInt64(33), false])
let types = [Type<String>(), Type<UInt64>(), Type<Bool>()]
var values = EVM.decodeABI(types, data: data)

// Assert the decoded values 
assert(values.length == 3)
assert((values[0] as! String) == "John Doe")
assert((values[1] as! UInt64) == UInt64(33))
assert((values[2] as! Bool) == false)

// Encode & decode `EVM.EVMAddress`
let address = EVM.EVMAddress(
    bytes: [122, 88, 192, 190, 114, 190, 33, 139, 65, 198, 8, 183, 254, 124, 91, 182, 48, 115, 108, 113]
)
data = EVM.encodeABI([address])
values = EVM.decodeABI([Type<EVM.EVMAddress>()], data: data)

// Assert the decoded values
assert(values.length == 1)
assert((values[0] as! [UInt8; 20]) == address.bytes)

// Encode & decode a `[UInt64]`
let array: [UInt64] = [5, 6]
data = EVM.encodeABI([array])
values = EVM.decodeABI([Type<[UInt64]>()], data: data)

// Assert the decoded values
assert(values.length == 1)
assert((values[0] as! [UInt64]) == [5, 6])

I was not able to use type parameters for the EVM.decodeABI method, as we can't have arbitrary type parameters for a method, they have to be fixed. Or I guess I misunderstood something in the suggested API.

cc @ryan-suematsu-anchain Can you please have a look on the API of these 2 methods, and their usage? Let me know if that's what you would like 🙏

}
arguments = append(arguments, gethABI.Argument{Type: typ})
case *interpreter.CompositeValue:
if value.QualifiedIdentifier == "EVM.EVMAddress" {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should probably be more bulletproof. Any idea on how to match the desired type EVM.EVMAddress?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this check should take the full type ID (location + qualified identifier) intro account, i.e. also ensure the location of the value is the location of the EVM contract.

That location depends on the network though, because the contract is deployed into the service account, which varies across networks. It's currently passed into SetupEnvironment, so maybe thread it through to the host function here

fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
Copy link
Member

@turbolent turbolent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
}
arguments = append(arguments, gethABI.Argument{Type: typ})
case *interpreter.CompositeValue:
if value.QualifiedIdentifier == "EVM.EVMAddress" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this check should take the full type ID (location + qualified identifier) intro account, i.e. also ensure the location of the value is the location of the EVM contract.

That location depends on the network though, because the contract is deployed into the service account, which varies across networks. It's currently passed into SetupEnvironment, so maybe thread it through to the host function here

fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
Copy link
Member

@turbolent turbolent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
fvm/evm/stdlib/contract.go Show resolved Hide resolved
fvm/evm/stdlib/contract.go Outdated Show resolved Hide resolved
@m-Peter m-Peter force-pushed the abi-encoding-decoding branch from 9d2bd74 to 03761d0 Compare December 8, 2023 13:57
@m-Peter
Copy link
Collaborator Author

m-Peter commented Dec 8, 2023

@turbolent Much thanks for the top-notch reviews on this PR 🙌 They were really helpful.
I think I have addressed almost all of the comments, except for the arbitrary nesting of arrays, e.g. [[UInt8]]. For now I have added support only for arrays of leaf values. Later on, we can generalize this.
The code looks really procedural 😅
I am not sure how to improve it, given the nature of the desired functionality, and all the type-mapping needed. But I am open to suggestions 🙏

@m-Peter m-Peter requested a review from turbolent December 8, 2023 16:25
@m-Peter m-Peter force-pushed the abi-encoding-decoding branch from fbe3d07 to 359e272 Compare December 13, 2023 11:38
@franklywatson franklywatson added this pull request to the merge queue Dec 13, 2023
Merged via the queue into onflow:master with commit ba6f0ff Dec 13, 2023
51 checks passed
@m-Peter m-Peter deleted the abi-encoding-decoding branch December 13, 2023 17:19
}

var arguments gethABI.Arguments
typesArray.Iterate(inter, func(element interpreter.Value) (resume bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need any protection against depth attacks here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #5146

Comment on lines +56 to +57
ComputationKindEVMEncodeABI = 2042
ComputationKindEVMDecodeABI = 2042
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-Peter these two are reusing the same number. we might need different weights for them so we might use different numbers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[EVM] Add support for abi.encode() and abi.decode() to Cadence
7 participants