Consider patching memset/memcpy inside of WASM to call the native implementations instead
#21
Comments
This is a difficult decision: This would fix our short-term issues with wasmi and is very tempting because it would probably be implementable right away. However, it directly opposes our goal of moving away from the deprecated I generally like the idea and I am not against it. But let me play devil's advocate and see how you feel about those drawbacks:
Indeed. But due to the way
If they're inlined it means they were copying something small enough that most likely this optimization would be a net loss.
Yeah, this is a good point. From what I can see
We could always add a simple test in our
We could probably just write a benchmark which would go through each possible size up to N and see where the cutoff is and then instead of emitting an unconditional call add an One problem I could see is that if we're going to use
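As a rough illustration of the cutoff idea, here is a minimal sketch in Rust. Everything in it is an assumption made for illustration: `find_cutoff` and `fill_with_cutoff` are hypothetical names, the cost functions stand in for real benchmark measurements, and the two branches stand in for "keep the in-WASM loop" versus "call out to the host".

```rust
/// Scans sizes 1..=max_len and returns the smallest size at which the
/// host call is strictly cheaper than the inlined loop, if there is one.
/// `inline_cost` and `host_cost` are hypothetical stand-ins for measured
/// timings (for example, nanoseconds per operation at a given size).
pub fn find_cutoff(
    max_len: usize,
    inline_cost: impl Fn(usize) -> u64,
    host_cost: impl Fn(usize) -> u64,
) -> Option<usize> {
    (1..=max_len).find(|&len| host_cost(len) < inline_cost(len))
}

/// Fills `buf` with `value`, choosing the path by size.
/// Returns `true` when the "host" path was taken.
pub fn fill_with_cutoff(buf: &mut [u8], value: u8, cutoff: usize) -> bool {
    if buf.len() < cutoff {
        // Stand-in for the loop that would stay inside the WASM blob.
        for byte in buf.iter_mut() {
            *byte = value;
        }
        false
    } else {
        // Stand-in for the call into the native implementation.
        buf.fill(value);
        true
    }
}
```

With a fixed per-call overhead on the host side and a per-byte cost on the guest side, `find_cutoff` returns the first size where paying the overhead is worth it, and the emitted conditional would compare the length against that value.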
The size of the object shouldn't affect the inlining decision. The functions are always of the same size. Something similar happens: I think for objects smaller than a certain threshold no call to this function is injected; an unrolled loop is emitted instead. In this case we don't want to patch, as you said. We are good here.
We still want wasm-opt. It does a lot of optimizations. I think we should just run our pass before wasm-opt. Then there is nothing to inline for wasm-opt.
We can't really do that because So if we'd want this and wasm-opt we'd have to instruct it to leave these functions alone (assuming it'll touch them).
Why can't we just put it into the
The whole point of this exercise is to make this optimization entirely transparent, with full backwards and forwards compatibility, and also to make it apply retroactively so that every runtime which already exists on chain would also automatically benefit (but that's just a bonus). We can't do this in On the other hand, with this approach we could most likely enable it right now. (And we can still remove this and switch to real bulk memory ops in the future once all of the pieces are there, precisely because it doesn't affect the runtime <-> host interface at all.)
Well, any JIT optimizer does this a lot, including
Ahh yeah I forgot about this. Yes it needs to be done when compiling then.
Yes, but this is expected behavior. I don't expect this from my blockchain node. But it's okay. It seems to be the only viable way.
Only grumble is that I don't see a way to tell wasm-opt which functions to leave alone for inlining.
Yeah, if we want both then this will have to be rectified. But assuming the worst case even if I have to add this to
Since there is no time frame to move away from stack height metering from Are you up for doing it?
If I were to do this in a production-ready fashion I'd probably like to also do #20 (that is, get rid of So, tentatively yeah. Right now I'm putting up a proof of concept implementation so that I can measure the performance benefit and can compare vanilla vs this optimization vs
I didn't know. This is great. After that the only blocker for wasm-proposals is the versioning of the executor.
Nice.
So I've been thinking about how to potentially speed up the contracts pallet, and I've remembered that just by enabling the bulk memory operations WASM feature we could potentially make the contracts pallet twice as fast (and also potentially speed up other things, as bulk memory operations are very widely used). Enabling this is unfortunately a little tricky, as outlined by @pepyakin here.
But then, I got a (maybe not so) crazy idea. What if we'd just.... monkeypatch this into the WASM bytecode before compiling it?
We already patch the WASM blobs to adjust e.g. the memory section or add stack depth metering. So why not do something like this?
* Patch memset and memcpy inside of the WASM blob.
* Have the patched functions call the native memset/memcpy on the host.

So this has the following benefits:

* It avoids the wasmtime bug where, depending on the phase of the moon (that is, whether the Cranelift-generated memset loop randomly ends up cache-aligned in memory or not), the contracts pallet's performance takes a sharp nosedive.

Any downsides here that I'm not seeing as to why we wouldn't want to do this?
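To make the host side of the idea concrete, here is a minimal sketch in Rust, assuming the guest's linear memory is visible to the host as a mutable byte slice. The names `host_memset`, `host_memcpy`, and `OutOfBounds` are hypothetical and not taken from any existing API; a real implementation would go through the executor's own memory accessors and error handling.

```rust
use std::ops::Range;

/// Hypothetical error for a guest access outside of linear memory.
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

/// Checks that `[offset, offset + len)` lies inside a memory of `mem_len` bytes.
fn check_range(mem_len: usize, offset: u32, len: u32) -> Result<Range<usize>, OutOfBounds> {
    let start = offset as usize;
    let end = start.checked_add(len as usize).ok_or(OutOfBounds)?;
    if end > mem_len {
        return Err(OutOfBounds);
    }
    Ok(start..end)
}

/// Host-side stand-in for the guest's `memset`: fills `len` bytes at `dst`.
/// Assumption: the fill value has already been truncated to a byte.
pub fn host_memset(memory: &mut [u8], dst: u32, value: u8, len: u32) -> Result<(), OutOfBounds> {
    let range = check_range(memory.len(), dst, len)?;
    memory[range].fill(value);
    Ok(())
}

/// Host-side stand-in for the guest's `memcpy`: copies `len` bytes from `src` to `dst`.
/// `copy_within` tolerates overlapping ranges, so this behaves like `memmove`.
pub fn host_memcpy(memory: &mut [u8], dst: u32, src: u32, len: u32) -> Result<(), OutOfBounds> {
    let src_range = check_range(memory.len(), src, len)?;
    let dst_range = check_range(memory.len(), dst, len)?;
    memory.copy_within(src_range, dst_range.start);
    Ok(())
}
```

Both functions validate the whole range before touching memory, so an out-of-bounds request fails without a partial write. How that failure would be surfaced to the guest is left open here.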
cc @paritytech/sdk-node @pepyakin @athei