Improve performance of Cache::analyze
#2051
Conversation
🙌
That's surprising to me. Loading the file from disk is much faster than hashing it? Strange. But we are probably talking about very small values at this point anyway.
cd19735 to 8b1f750
I am not quite sure how this change could even have impacted the
I think what is happening here is that corrupted.wasm now returns a different error because the order of checks in `check_wasm` changed. I think we can just update the test.
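The effect described here can be illustrated with a toy version (the check names and error strings below are hypothetical, not the actual `check_wasm` internals): when a corrupted module fails more than one independent check, reordering the checks changes which error is reported first, even though the module is rejected either way.

```rust
// Hypothetical check: a Wasm module must start with the `\0asm` magic bytes.
fn check_magic(wasm: &[u8]) -> Result<(), String> {
    if wasm.starts_with(b"\0asm") {
        Ok(())
    } else {
        Err("invalid magic".to_string())
    }
}

// Hypothetical check: the module must be longer than just a header.
fn check_bodies(wasm: &[u8]) -> Result<(), String> {
    if wasm.len() > 8 {
        Ok(())
    } else {
        Err("truncated body".to_string())
    }
}

fn main() {
    // 8 bytes, wrong magic: fails both checks.
    let corrupted: &[u8] = b"garbage!";

    // Old order: magic check runs first, so its error surfaces.
    let old = check_magic(corrupted).and_then(|_| check_bodies(corrupted));
    // New order: body check runs first, so a different error surfaces.
    let new = check_bodies(corrupted).and_then(|_| check_magic(corrupted));

    assert_eq!(old.unwrap_err(), "invalid magic");
    assert_eq!(new.unwrap_err(), "truncated body");
    println!("same rejection, different error message");
}
```

This is why updating the expected error string in the test is enough; no behavior regression is implied.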
Very nice. Could you add a CHANGELOG.md entry to the Unreleased section that briefly explains what we did here and links the PR?
Okay, pushed to the wrong branch. One sec.
Improve the performance of the `Cache::analyze` function.

When analyzing the current benchmarks using `perf`, you'll see that most of the time is spent inside the `FuncValidator::validate` function. This PR changes the logic to only validate the function bodies when the Wasm is saved into the cache, since we already assume that the cache only contains modules which check all the validation boxes.
Now the largest timespan spent is on the SHA-256 checksum generation.
Another minor change is that the function body validation is now done in parallel which will mostly benefit WASM modules with a lot of functions.
The downside of this is that the allocations can't be reused since that would imply shared mutable ownership across threads, and using a mutex here would just make this code essentially sequential again.
But there is still an ~15% average performance gain over not doing it in parallel in my unreliable benchmarks (they were performed on my workstation with rust-analyzer, Firefox, etc. open).
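The parallel part can be sketched with std-only scoped threads (`validate_body` is a hypothetical stand-in, and the PR may well use a different threading mechanism). Each thread owns its own scratch state, which is exactly why allocations can't be reused across threads without a mutex:

```rust
use std::thread;

// Hypothetical stand-in for validating one function body.
fn validate_body(body: &[u8]) -> Result<(), String> {
    if body.is_empty() {
        Err("empty function body".to_string())
    } else {
        Ok(())
    }
}

// Validate all bodies in parallel. Each spawned thread works on its own
// chunk and owns its own allocations; nothing mutable is shared.
fn validate_all_parallel(bodies: &[Vec<u8>]) -> Result<(), String> {
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    let chunk_size = ((bodies.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = bodies
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || chunk.iter().try_for_each(|b| validate_body(b)))
            })
            .collect();
        handles
            .into_iter()
            .try_for_each(|h| h.join().expect("validation thread panicked"))
    })
}

fn main() {
    let bodies: Vec<Vec<u8>> = (0..100).map(|i| vec![i as u8 + 1]).collect();
    assert!(validate_all_parallel(&bodies).is_ok());
    println!("validated {} bodies in parallel", bodies.len());
}
```

Because the chunks are independent and validation is read-only per body, the only coordination cost is the final join, which matches the observed speedup on modules with many functions.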
Closes #2033