-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WebAssembly size regression between 1.40 and 1.41 #74947
Comments
Could you detail a bit more what's needed to reproduce this regression? For example what in that repo is being built? Additionally would it be possible to minimize the number of tools in play, e.g. only using cargo/rustc? |
@alexcrichton Right, sorry. As I mentioned, it's HQX codec. Basically you need to go to this folder: https://github.com/GoogleChromeLabs/squoosh/tree/dev/codecs/hqx and inside run Alternatively, if you don't want to use Docker here, you can use In terms of tooling, in this case npm invokes Docker, which downloads Rust image and invokes I've shown sizes after |
Just realised I forgot to cc @CryZe (author of HQX crate we're wrapping) who might be also interested in this investigation. |
There's a huge amount of bounds checked indexing going on in that crate, I wouldn't be surprised if track_caller or so is now bloating up the panic location information that is stored in the binary. This is a complete guess though. |
Bisection shows this regression happened between nightly-2019-11-06 and nightly-2019-11-07. Per-merge bisection isn't available since those commits are old enough, but they sure enough contain #65973 which, if there's tons of panics, would indeed cause a regression in binary size. |
Ah interesting. Is there anything that could be done about it / any way to disable for size-optimized builds (hopefully, without going the unstable |
A lot of indexing is done via macros, maybe moving the indexing into a few amount of functions would ensure that there's less location information (which LLVM should then be able to inline away). |
I recall @anp doing some benchmarking on the size regressions and we measured them to be less than 1% on librustc(?) -- but I could definitely see code that indexes more having a greater regression. @anp, did we leave behind any flag to disable the track caller feature? I guess you'd need to recompile libstd regardless, since track_caller is compiled into those artifacts. But maybe we could ship a track-caller-less binary for, say, wasm? |
@RReverser have you enabled console_error_panic_hook or is it disabled? If the feature is disabled, it's possible that the info is all gone. |
Doesn't work, just checked it. See also rustwasm/team#19 |
The mitigation hasn't been implemented yet. IIRC there'd be some nuance to implementing the flag as described in the RFC but it should be straightforward to implement the all-or-nothing version. I added a comment on implementation options on the mitigation issue with more detail if someone's able to pick it up. It would probably be good to confirm that it is indeed locations causing the bloat. It seems very likely but it might be worth confirming with the hacky custom linker section support I tried adding to rustc. |
Further bisection, aka building rustc before/after, shows that #65973 looks to be the cause of this regression. Locally using my own rustc the size of the wasm binary coming out of rustc jumps from 534k to 622k (and libstd has debuginfo enabled). I wasn't able to match the nightlies exactly but they similarly regressed ~100k around those nightlies too. |
I'd imagine it would be useful for any targets that use More generally, could there be a way to tie this feature to the |
Just tried it again, disabling console_error_panic_hook plus running wasm-snip on the un-wasm-opt'd binary (followed by wasm-opt) reduces the wasm file size significantly compared to the baseline. However, there is still a regression between 1.40 and 1.41, probably due to std increases?
|
I didn't even notice that that codec has it enabled by default, we should disable it 🤦♂️ But yeah, as you said, ~same regression remains regardless. Btw, why is there UPD Oh, also note that you're using -O3 not -Os - probably the reason why output files in 1.40 in your experiment are larger than mine, even after wasm-snip. |
Yeah wasm-pack hangs in the wasm-opt step for minutes. Eventually I just aborted it. Files were generated but no idea whether they were functional or not, so I excluded them.
Good point! Personally I prefer -O3 because most times I've experienced that -Os creates barely smaller binaries and in fact sometimes -O3 created smaller ones probably because of some optimization. Same here, there is barely any difference, if there's any at all it has to be in the sub kb range. I'm getting the following sized after replacing -O3 with -Os:
|
It takes up to 20 minutes on my MBP. But the resulting binary is smaller and functional. We actually needed to do this to avoid seeing 100% CPU usage for multiple minutes in Chrome 😅 |
Btw, I think that another reason is that |
Assigning |
I've played a bit and added a non-invasive change to the HQX - CryZe/wasmboy-rs#1 - to work around the code size regression (rust-lang/rust#74947) introduced in the latest Rust. As a side benefit of the change, the build time also went down significantly and now takes only 1 minute altogether - including spawning Docker, fetching Cargo, building Wasm and optimising it with wasm-opt - instead of 15-20 minutes it took before. P.S. h/t @CryZe for a very quick review & publish.
I've played a bit and added a non-invasive change to the HQX - CryZe/wasmboy-rs#1 - to work around the code size regression (rust-lang/rust#74947) introduced in the latest Rust. As a side benefit of the change, the build time also went down significantly and now takes only 1 minute altogether - including spawning Docker, fetching Cargo, building Wasm and optimising it with wasm-opt - instead of 15-20 minutes it took before. P.S. h/t @CryZe for a very quick review & publish.
While upgrading the build configs and compiler versions in squoosh.app in
c5c520a
(#777), we (cc @jakearchibald @surma) have noticed a significant size increase in one of the image codecs - HQX.Normally we ignore fluctuations between versions, as they are expected, but in this particular case the resulting file grew from 219KB to 381KB - by 162KB or 74%.
For now we reverted that codec to the Rust version it was initially built with - 1.39
Meanwhile, I've started investigating and going through Rust versions starting from 1.39 to the latest 1.45 to find the one that introduced the regression. This process is a bit slow, but in the end I've ended up with the following pinpoint changes, file sizes & included
wasm-objdump -h
logs:The TL;DR of our build config is
opt-level = "s"
andlto = true
, and then usingwasm-pack
to also optimise for size & strip debug info. In order to build this particular codec, you need to go to thecodecs/hqx
folder and runnpm run build
inside. It will take care of downloading the latest Rust Docker image and then building the codec withwasm-pack build
. Alternatively, you can usewasm-pack
or evencargo build
directly in the folder, assuming you've set correct Rust versions to reproduce the issue.The logs above can be a bit verbose, and raw file sizes reflect changes also in size of debug sections and such, which are not very interesting in this context. Where I use
1.41+
or1.44+
, it means that following versions exhibit pretty much same sizes par the normal fluctuation, and only versions with significant increase are kept.To make changes a bit easier to analyse, I've split out only code and data section sizes in the following spreadsheet: https://docs.google.com/spreadsheets/d/1ToE7Th7fp_VuQwws45ZgwBV1Pg09U061na0yFDaLwpk/edit?usp=sharing
Here is the graph showing the code and data increase between those version groups:
As you can see from raw logs, 1.44+ produces smaller raw file but it has comparable code and data sections sizes, and remains at the 1.41+ level after wasm-opt, which suggests the decrease is mainly around debug info, and not very interesting to us.
However, the change between 1.40 and 1.41 is more radical: the data section has increased from 12KB to 100KB (by 88KB or 8.3x of the original), and, while the code section almost hasn't changed, it can't be optimised by wasm-opt as well anymore.
I don't have enough insight and didn't dig deeper into Wasm, but suspect this is not a separate issue, but related to the data section increase - probably some data sections kept by Rust / LLVM, consequently, don't allow wasm-opt to DCE out some unused code that could be removed before.
Would appreciate if someone on the Rust side could take over further investigation and happy to help out with build instructions to reproduce. Although we use Dockerfiles, so it should be fairly straightforward to build.
Thanks!
The text was updated successfully, but these errors were encountered: