libgcc_13: SIGABORT inside UnwindRegistration::drop, __deregister_frame. #7997

Closed
Mr-Leshiy opened this issue Feb 26, 2024 · 8 comments · Fixed by #8028
Labels
bug Incorrect behavior in the current implementation that needs fixing

Comments

@Mr-Leshiy
Mr-Leshiy commented Feb 26, 2024

We hit a SIGABRT while running a simple benchmark, which you can find here: https://github.com/input-output-hk/hermes/blob/f4d20fc06f0b558be4805a106db32480bcef649d/hermes/bin/src/wasm/module.rs#L118.
Stack trace:

#0  __restore_sigs (set=set@entry=0xffffc437ab50) at ./arch/aarch64/syscall_arch.h:48
#1  0x0000ffff9c2add5c in raise (sig=sig@entry=6) at src/signal/raise.c:11
#2  0x0000ffff9c27cbf4 in abort () at src/exit/abort.c:11
#3  0x0000ffff9c23e8fc in __deregister_frame_info_bases () from /usr/lib/libgcc_s.so.1
#4  0x0000ffff9c23e91c in __deregister_frame () from /usr/lib/libgcc_s.so.1
#5  0x0000aaaac2c0d0a8 in wasmtime_runtime::sys::unix::unwind::{impl#1}::drop (self=<optimized out>) at src/sys/unix/unwind.rs:87
#6  0x0000aaaac2ba2ba8 in core::ptr::drop_in_place<wasmtime_runtime::sys::unix::unwind::UnwindRegistration> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#7  core::ptr::drop_in_place<core::option::Option<wasmtime_runtime::sys::unix::unwind::UnwindRegistration>> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#8  0x0000aaaac2bb4d5c in core::mem::manually_drop::ManuallyDrop<core::option::Option<wasmtime_runtime::sys::unix::unwind::UnwindRegistration>>::drop<core::option::Option<wasmtime_runtime::sys::unix::unwind::UnwindRegistration>> (slot=0x0)
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/mem/manually_drop.rs:144
#9  wasmtime_jit::code_memory::{impl#0}::drop (self=0xffff9c216eb0) at src/code_memory.rs:42
#10 core::ptr::drop_in_place<wasmtime_jit::code_memory::CodeMemory> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#11 alloc::sync::Arc<wasmtime_jit::code_memory::CodeMemory, alloc::alloc::Global>::drop_slow<wasmtime_jit::code_memory::CodeMemory, alloc::alloc::Global> (self=<optimized out>) at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:1752
#12 0x0000aaaac27cfcb8 in alloc::sync::{impl#33}::drop<wasmtime_jit::code_memory::CodeMemory, alloc::alloc::Global> (
    self=0xffff9c213540) at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:2408
#13 core::ptr::drop_in_place<alloc::sync::Arc<wasmtime_jit::code_memory::CodeMemory, alloc::alloc::Global>> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#14 core::ptr::drop_in_place<wasmtime::code::CodeObject> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#15 alloc::sync::Arc<wasmtime::code::CodeObject, alloc::alloc::Global>::drop_slow<wasmtime::code::CodeObject, alloc::alloc::Global> (
    self=<optimized out>) at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:1752
#16 0x0000aaaac27d06a8 in alloc::sync::{impl#33}::drop<wasmtime::code::CodeObject, alloc::alloc::Global> (self=0x0)
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:2408
#17 core::ptr::drop_in_place<alloc::sync::Arc<wasmtime::code::CodeObject, alloc::alloc::Global>> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#18 core::ptr::drop_in_place<wasmtime::component::component::ComponentInner> ()
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#19 alloc::sync::Arc<wasmtime::component::component::ComponentInner, alloc::alloc::Global>::drop_slow<wasmtime::component::component::ComponentInner, alloc::alloc::Global> (self=<optimized out>)
    at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:1752
#20 0x0000aaaac275d494 in alloc::sync::{impl#33}::drop<wasmtime::component::component::ComponentInner, alloc::alloc::Global> (
    self=0xffffc437ad40) at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/sync.rs:2408
#21 core::ptr::drop_in_place<alloc::sync::Arc<wasmtime::component::component::ComponentInner, alloc::alloc::Global>> ()
    at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#22 core::ptr::drop_in_place<wasmtime::component::component::Component> ()
    at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#23 core::ptr::drop_in_place<wasmtime::component::instance::InstancePre<hermes::state::HermesState>> ()
    at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#24 0x0000aaaac275b390 in core::ptr::drop_in_place<hermes::wasm::module::Module> ()
    at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498
#25 0x0000aaaac2761460 in hermes::wasm::module::bench::module_hermes_component_bench (b=0xffffc437ae30) at bin/src/wasm/module.rs:143
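
Read from the bottom up, the trace is a chain of destructors: dropping the benchmark's `Module` releases the last `Arc` reference to the compiled code, and only then does `UnwindRegistration::drop` call `__deregister_frame`. A minimal sketch of that ownership pattern follows. The `Registration` and `CodeMemory` types here are hypothetical stand-ins, not Wasmtime's real definitions, and a counter takes the place of the actual libgcc call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for an unwind registration guard.
struct Registration {
    deregistered: Arc<AtomicUsize>,
}

impl Drop for Registration {
    fn drop(&mut self) {
        // The real guard would call `__deregister_frame` here; this sketch
        // only counts how many times deregistration would have happened.
        self.deregistered.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for the object that owns the JIT code.
struct CodeMemory {
    _unwind: Option<Registration>,
}

/// Returns the deregistration count before and after the last owner is dropped.
fn demo() -> (usize, usize) {
    let counter = Arc::new(AtomicUsize::new(0));
    let code = Arc::new(CodeMemory {
        _unwind: Some(Registration { deregistered: Arc::clone(&counter) }),
    });
    let second_owner = Arc::clone(&code);

    drop(code); // another owner is still alive, so nothing is deregistered yet
    let before = counter.load(Ordering::SeqCst);

    drop(second_owner); // last reference gone: `Registration::drop` runs now
    let after = counter.load(Ordering::SeqCst);
    (before, after)
}

fn main() {
    let (before, after) = demo();
    println!("before last drop: {before}, after last drop: {after}");
}
```

This is why the abort shows up at the end of the benchmark body rather than during compilation or instantiation.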

Steps to Reproduce

You can fully reproduce this in an already prepared environment by cloning our repo and checking out the branch libgcc-wasmtime-unwinding-bug (repo: https://github.com/input-output-hk/hermes).

Before you start, you will need to install the earthly tool.
Then run the following from the root of the repo:

earthly -I ./hermes+alpine-3-19-fail

This prepares an alpine:3.19 environment with rust:1.75, copies in our project configuration and source code, and runs the benchmarks.
Earthfile code: https://github.com/input-output-hk/hermes/blob/libgcc-wasmtime-unwinding-bug/hermes/Earthfile#L18

alpine-3-19-fail:
    FROM rust:1.75-alpine3.19
    
    # Install necessary packages
    RUN apk add --no-cache \
            musl-dev \
            gdb

    COPY --dir .cargo .config crates bin .
    COPY Cargo.toml .
    COPY clippy.toml deny.toml rustfmt.toml .

    RUN mkdir /wasm
    COPY --dir ../wasm+wasi-src/wasi /wasm/wasi
    # Compiled WASM component for benchmarks
    COPY ../wasm/c+build/component.wasm /wasm/c/bench_component.wasm

    # Run benchmarks
    RUN cargo bench --features bench

After the failure, earthly drops into interactive mode because of the interactive flag (similar to Docker's interactive mode),
so you can collect a core dump inside the container:

ulimit -c unlimited && ./target/release/deps/module-<> --bench

Then inspect the core dump using gdb:

gdb  ./target/release/deps/module-<> core -- --bench

Actual Results

SIGABRT (signal 6), raised by abort() inside __deregister_frame_info_bases.

Versions and Environment

Wasmtime version or commit: 17.0.0

Operating system: alpine:3.19

Extra Info

This works without problems on alpine:3.18, which ships libgcc 12.
We suspect the issue is most likely in libgcc, because the unwinding implementation differs substantially between these two versions.

libgcc_12 - https://github.com/gcc-mirror/gcc/blob/8cbb2cade4c724760c868c9f493b169d6ec4168a/libgcc/unwind-dw2-fde.c#L201

libgcc_13 - https://github.com/gcc-mirror/gcc/blob/0e7bc3eaa36b81004b799124d2fe00137401a43b/libgcc/unwind-dw2-fde.c#L225

You can try the passing configuration by running the already prepared earthly target: https://github.com/input-output-hk/hermes/blob/libgcc-wasmtime-unwinding-bug/hermes/Earthfile#L38

earthly ./hermes+alpine-3-18-pass
@Mr-Leshiy Mr-Leshiy added the bug Incorrect behavior in the current implementation that needs fixing label Feb 26, 2024
@Mr-Leshiy Mr-Leshiy changed the title libgcc: SIGABORT inside UnwindRegistration::drop, __deregister_frame. libgcc_13: SIGABORT inside UnwindRegistration::drop, __deregister_frame. Feb 26, 2024
@alexcrichton
Member

Thanks for the report! Would you be able to upload the wasm file here directly? Also would you be able to share a copy/paste of the stack trace you're seeing? I can't reproduce locally with a "hello-world" style module loaded into alpine:3.19 so it seems that the issue may be related to the specific unwind table for the wasm file. I don't know much about earthly myself so I'm hopeful that the issue can be reduced relatively quickly to just a wasm file and some simple interactions with the wasmtime crate.

@Mr-Leshiy
Author

@alexcrichton Yes, this bug appears with this specific WASM component.

Give me some time and I will try to put together a separate, simple project that reproduces it.

bench_component.wasm.zip

@alexcrichton
Member

Ok thanks for the component! Looks like just compiling the component, dropping it, and doing that a few times is not sufficient to trigger any bugs. A more minimal repro would be much appreciated, even if it's just a high-level description of how the component is instantiated and worked with. A runnable repro would be perfect!

Otherwise though I'll try to dig in later this week and see if I can't see anything that's awry.

@alexcrichton
Member

Ok I'm able to reproduce this with:

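use wasmtime::*;

fn main() {
    // Loop forever: each iteration builds a fresh engine, compiles the
    // component, and drops both at the end of the loop body.
    for i in 0.. {
        println!("{i}");
        let mut config = Config::new();
        config.parallel_compilation(false);
        let e = Engine::new(&config).unwrap();
        // `_m` stays alive until the end of the iteration; dropping it is
        // what reaches `UnwindRegistration::drop` in the reported stack trace.
        let _m = component::Component::from_file(&e, "/src/bench_component.wasm").unwrap();
    }
}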

when specifically compiling with RUSTFLAGS='-Ctarget-feature=-crt-static' in a rust:1.75-alpine3.19 container.

Still trying to figure out what's going on here, but in the meantime setting this config to false should fix the issue.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Feb 29, 2024
When native unwinding information is enabled Wasmtime will use the
`__register_frame` API on Unix platforms to inform the runtime of the
unwinding information generated for wasm modules. This function,
however, has a different interface in libgcc vs libunwind. This means
that we need to detect which is being used and adapt our calls to it
appropriately.

Previously this decision was static. FreeBSD and Linux glibc would
assume libgcc and everything else was assumed to be libunwind. It's
possible to use libgcc on other platforms, however, such as with musl.
The goal of this PR is to make the detection here more robust.

Specifically this PR now probes for a symbol at runtime rather than
relying on a compile-time decision. That way we can see what we got at
runtime and make the decision then.

Closes bytecodealliance#7997
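
The runtime probe described in this commit message can be sketched roughly as follows. This is not the code from the PR. It assumes a dynamically linked Linux process, where a null handle passed to `dlsym` means `RTLD_DEFAULT`. The probed symbol name `__unw_add_dynamic_fde` is an assumption about a libunwind-only export (the thread does not say which symbol is checked), and `symbol_present` / `using_libunwind` are illustrative names.

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_void};
use std::sync::OnceLock;

extern "C" {
    // Provided by the C library (libc or libdl) on Unix platforms.
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

/// Returns true if `name` resolves in the global symbol scope of this process.
/// Assumption: a null handle is `RTLD_DEFAULT`, which holds for glibc and musl.
fn symbol_present(name: &str) -> bool {
    let c_name = match CString::new(name) {
        Ok(s) => s,
        Err(_) => return false, // interior NUL byte: cannot be a symbol name
    };
    let addr = unsafe { dlsym(std::ptr::null_mut(), c_name.as_ptr()) };
    !addr.is_null()
}

/// Decides once, at runtime, which registration interface to use.
/// The symbol name is an assumed libunwind-only export, used for illustration.
fn using_libunwind() -> bool {
    static CACHED: OnceLock<bool> = OnceLock::new();
    *CACHED.get_or_init(|| symbol_present("__unw_add_dynamic_fde"))
}

fn main() {
    if using_libunwind() {
        println!("libunwind detected: use its registration interface");
    } else {
        println!("no libunwind symbol found: assume the libgcc interface");
    }
}
```

Caching the answer keeps the lookup off the hot path, since the unwinder linked into a process does not change while it runs.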
@alexcrichton
Member

Ok and I think that #8028 should completely fix this

@Mr-Leshiy
Author

@alexcrichton BTW, don't you think it could also be a bug inside libgcc?

@alexcrichton
Member

Perhaps! I wouldn't be able to conclude that with any certainty though. The libgcc code is pretty opaque to me. I think it's probably lucky that this worked before because libgcc's interface is clearly different than libunwind's and we were using the wrong one on musl. With the above PR we should correctly use the libgcc-desired interface on musl by seeing that libunwind isn't available.
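
For readers unfamiliar with the difference: as far as I understand it, libgcc's `__register_frame` is handed a pointer to the start of an `.eh_frame`-style table and walks the records itself, while libunwind's function of the same name expects a single FDE, so the caller has to walk the table and register each FDE separately. The sketch below shows only the walking step, over a plain byte slice. It assumes 32-bit length fields (the extended 64-bit form is not handled), and `fde_offsets` and `entry` are hypothetical helper names.

```rust
/// Returns the byte offset of every FDE in an `.eh_frame`-style buffer.
/// Each record is a 4-byte length followed by that many bytes; the first
/// 4 bytes of the body are an ID that is 0 for a CIE and nonzero for an FDE.
fn fde_offsets(eh_frame: &[u8]) -> Vec<usize> {
    let read_u32 = |at: usize| {
        u32::from_ne_bytes([eh_frame[at], eh_frame[at + 1], eh_frame[at + 2], eh_frame[at + 3]])
    };
    let mut offsets = Vec::new();
    let mut pos = 0;
    while pos + 8 <= eh_frame.len() {
        let len = read_u32(pos);
        // 0 is the terminator; 0xffff_ffff marks the 64-bit form, unsupported here.
        if len == 0 || len == 0xffff_ffff {
            break;
        }
        if read_u32(pos + 4) != 0 {
            offsets.push(pos); // an FDE: one registration call per entry
        }
        pos += 4 + len as usize;
    }
    offsets
}

/// Builds one synthetic record with the given ID and `extra` padding bytes.
fn entry(id: u32, extra: usize) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&((4 + extra) as u32).to_ne_bytes());
    out.extend_from_slice(&id.to_ne_bytes());
    out.extend(std::iter::repeat(0u8).take(extra));
    out
}

fn main() {
    let mut table = entry(0, 4); // CIE at offset 0
    table.extend(entry(16, 4)); // FDE at offset 12
    table.extend(entry(28, 0)); // FDE at offset 24
    table.extend_from_slice(&0u32.to_ne_bytes()); // terminator
    println!("{:?}", fde_offsets(&table)); // prints [12, 24]
}
```

Passing a single FDE pointer to an implementation that expects a whole table, or the reverse, leaves registration and deregistration disagreeing about what was recorded, which fits the abort seen in `__deregister_frame`.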

@Mr-Leshiy
Author

Mr-Leshiy commented Feb 29, 2024

I see, thanks a lot for such a quick response!
We will apply this patch to our code in the next few days and let you know how it works.

github-merge-queue bot pushed a commit that referenced this issue Feb 29, 2024