Use .init_array on all WASM targets, not just WASI/Emscripten #76

dfoxfranke · 2025-01-06T21:53:24Z

wasm-ld's support for .init_array isn't specific to WASI or Emscripten; it works regardless of what platform is being targeted. This PR adjusts the conditional-compilation attributes accordingly.
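As a rough illustration of the kind of conditional-compilation change described here, the sketch below places a function pointer in `.init_array` when the target is Linux or any WebAssembly target. The names `register`, `REGISTER_CTOR`, and `ctor_ran` are hypothetical, the cfg predicate is an assumption rather than the crate's actual attribute list, and the `unsafe(...)` attribute form assumes Rust 1.82 or newer.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the constructor below has run (hypothetical helper state).
static CTOR_RAN: AtomicBool = AtomicBool::new(false);

// Hypothetical constructor body; a real crate would register its items here.
unsafe extern "C" fn register() {
    CTOR_RAN.store(true, Ordering::SeqCst);
}

// Assumption: gate on the whole wasm family instead of listing WASI and
// Emscripten individually. On targets matching neither predicate the
// static is still emitted, but nothing calls it at startup.
#[used]
#[cfg_attr(
    any(target_os = "linux", target_family = "wasm"),
    unsafe(link_section = ".init_array")
)]
static REGISTER_CTOR: unsafe extern "C" fn() = register;

/// Reports whether the startup machinery invoked `register`.
pub fn ctor_ran() -> bool {
    CTOR_RAN.load(Ordering::SeqCst)
}
```

On a wasm target, whether `register` actually runs still depends on something calling `__wasm_call_ctors`.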

I was surprised that getting wasm32-unknown-unknown working was quite this easy, because I expected that in order to get constructors to run, I would have to explicitly invoke __wasm_call_ctors from the top of my entrypoint function, but it turns out that wasm-ld inserts this automatically!

This confused me for a while because when I looked at the disassembly I wasn't seeing any safeguards against __wasm_call_ctors being called multiple times if the embedder makes multiple calls into an instantiated module. It turns out that wasm-ld has a somewhat crummy heuristic for handling this: if you're doing what it considers a "command-style" link and you never invoke __wasm_call_ctors explicitly anywhere in your module, then it assumes that the embedder is only ever going to make one call into any given instance of the module and that it should put the __wasm_call_ctors call at the top of every exported function. Otherwise, it refrains and assumes that the embedder will take responsibility for causing __wasm_call_ctors to run at the appropriate time.
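The command-style behavior described above can be pictured with a toy model. This is not wasm-ld output; `Instance`, `call_ctors`, and `call_export` are hypothetical stand-ins that only mimic "run the constructors at the top of every exported function".

```rust
/// Toy model of one instantiated module (hypothetical, not real linker output).
pub struct Instance {
    /// How many times the constructor stand-in has run.
    pub ctor_runs: u32,
}

impl Instance {
    pub fn new() -> Self {
        Instance { ctor_runs: 0 }
    }

    /// Stand-in for `__wasm_call_ctors`.
    fn call_ctors(&mut self) {
        self.ctor_runs += 1;
    }

    /// Stand-in for the wrapper placed around an exported function in a
    /// command-style link: constructors run first, then the function body.
    pub fn call_export<R, F: FnOnce() -> R>(&mut self, body: F) -> R {
        self.call_ctors();
        body()
    }
}
```

In this model, two calls into the same `Instance` leave `ctor_runs` at 2, which is the situation the heuristic assumes will never arise.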

That seems a little janky to me, but the jank is orthogonal to anything that I think this crate needs to concern itself with. So I think this PR is correct, and if anything needs to change about how __wasm_call_ctors is called then that's for LLVM to deal with.

dtolnay · 2025-01-06T22:46:37Z

What happens if the embedder makes more than a single function call into the same instance of a module?

dfoxfranke · 2025-01-06T22:48:50Z

If that happens despite the heuristics determining that it shouldn't, then the constructors will get called multiple times.

dtolnay · 2025-01-06T23:05:01Z

Specifically for the constructors used by this crate, what would be the observable behavior for someone using this crate's public API?

dfoxfranke · 2025-01-06T23:13:11Z

I think you'd wind up with collections returning multiple copies of everything submitted to them.

To be clear, to whatever extent this is a problem, it's a problem for all WASM targets. I don't think this PR exacerbates it.

dtolnay · 2025-01-07T05:54:40Z

The underlying data structure for inventory is a linked list of statically allocated nodes, so I was uncertain how that could end up containing a variable number of copies of each submitted value dependent on the runtime number of calls to __wasm_call_ctors. Having n duplicates of each value for n init calls is not one of the possible options with this data structure.

The only possible outcomes are:

1. correct behavior, i.e. init is accidentally idempotent
2. linked list becomes circular
3. undefined behavior

From inspecting Registry::submit, the actual behavior is 2, so inventory::iter::<T> would be an infinite loop. That is bad enough that I think it would need to be solved before declaring that wasm is supported. Either the submit logic needs to be changed (on cfg wasm only) or the iterator needs to correctly handle a circular list by stopping when the list's head is reached for a second time.
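To see why re-running the constructors produces a cycle rather than duplicates, here is a simplified model of a push-front list over statically allocated nodes. It is a sketch under assumptions, not the crate's actual code: `Node`, `Registry`, `walk`, and `run_ctors` are illustrative names, and `walk` takes a hypothetical `limit` so the demonstration terminates.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// A statically allocated list node (simplified model).
pub struct Node {
    pub value: &'static str,
    next: AtomicPtr<Node>,
}

impl Node {
    pub const fn new(value: &'static str) -> Self {
        Node { value, next: AtomicPtr::new(ptr::null_mut()) }
    }
}

pub struct Registry {
    head: AtomicPtr<Node>,
}

impl Registry {
    pub const fn new() -> Self {
        Registry { head: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Naive push-front: overwrites `node.next` with the current head,
    /// even if `node` is already somewhere in the list.
    pub fn submit(&self, node: &'static Node) {
        let new = node as *const Node as *mut Node;
        let mut head = self.head.load(Ordering::Relaxed);
        loop {
            node.next.store(head, Ordering::Relaxed);
            match self.head.compare_exchange(head, new, Ordering::Release, Ordering::Relaxed) {
                Ok(_) => return,
                Err(actual) => head = actual,
            }
        }
    }

    /// Visits at most `limit` nodes so that a circular list cannot hang the caller.
    pub fn walk(&self, limit: usize) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut cur = self.head.load(Ordering::Acquire);
        while !cur.is_null() && out.len() < limit {
            // Safety: every pointer in the list comes from a `&'static Node`.
            let node: &Node = unsafe { &*cur };
            out.push(node.value);
            cur = node.next.load(Ordering::Relaxed);
        }
        out
    }
}

static A: Node = Node::new("a");
static B: Node = Node::new("b");

/// Stand-in for one `__wasm_call_ctors` invocation.
pub fn run_ctors(registry: &Registry) {
    registry.submit(&A);
    registry.submit(&B);
}
```

After one `run_ctors` call the walk yields `b, a` and stops. After a second call, `A.next` points at `B` and `B.next` points at `A`, so the walk only ends because of `limit`.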

From https://github.com/llvm/llvm-project/blob/llvmorg-19.1.6/lld/wasm/Writer.cpp#L1525-L1591, it seems like the __wasm_call_ctors implementation in wasm-ld is geared around the following kind of code:

namespace {
std::string s = compute_s();
}

where calling the ctor on every entry into the module, and calling the dtor on every exit, is fine and doesn't result in a memory leak. It seems misguided to me, considering scenarios like:

const std::vector<T> vec = {...};

void enqueue(std::size_t i) {
  static const T *prev;
  if (prev) {
    dequeue(*prev);
  }
  prev = &vec[i];
}

which is fine on non-wasm, but on wasm, the dtor+ctor between consecutive calls would cause use-after-free.

The wasm-ld implementation I would have expected is that it generates a boolean that remembers whether constructors have already run, and only runs them on the first call. But I guess they considered that problematic due to destructors never being run.
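A minimal sketch of that "remember whether constructors have run" idea, written as a Rust model rather than anything wasm-ld emits. `Module`, `call_ctors_once`, and `call_export` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

/// Toy model of a module whose export wrappers guard constructor execution.
pub struct Module {
    ctors_done: AtomicBool,
    /// How many times the constructor stand-in actually ran.
    pub ctor_runs: AtomicU32,
}

impl Module {
    pub const fn new() -> Self {
        Module {
            ctors_done: AtomicBool::new(false),
            ctor_runs: AtomicU32::new(0),
        }
    }

    /// Runs the constructor stand-in only if the flag was previously unset.
    fn call_ctors_once(&self) {
        if !self.ctors_done.swap(true, Ordering::AcqRel) {
            self.ctor_runs.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Stand-in for an export wrapper: guarded constructors, then the body.
    pub fn call_export<R, F: FnOnce() -> R>(&self, body: F) -> R {
        self.call_ctors_once();
        body()
    }
}
```

With this guard, any number of calls leaves `ctor_runs` at 1. The trade-off noted above still applies: nothing in this scheme ever runs destructors.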

dfoxfranke · 2025-01-07T13:40:37Z

Ah my bad; I only glanced over the linked-list code and didn't notice that the allocations were static. So yeah, a circular list would be the result here, which is rather bad because it would be difficult for users of the crate to figure out what was going wrong. I think a wasm-only change to the submit logic is the easiest and most robust solution. I'll amend this PR to add that.

dfoxfranke · 2025-01-07T16:22:20Z

I've updated this PR to ensure idempotence. I tested it by manually invoking __wasm_call_ctors() multiple times from the entry point of a test program. This confirmed both that, without this patch, it results in a circular linked list, and that it works correctly when this patch is included.

I also updated the original commit by tweaking the target conditionals to ensure that it's logically impossible for them to overlap, and added some documentation about WebAssembly support and how __wasm_call_ctors works.
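One way to make submission idempotent, sketched here under assumptions rather than quoted from the patch, is a per-node flag that turns repeated submissions into no-ops. The `submitted` field and the names `Node`, `Registry`, `walk`, and `run_ctors` are illustrative; the loop in the test mirrors the manual repeated-constructor check described above.

```rust
use std::ptr;
use std::sync::atomic::{AtomicBool, AtomicPtr, Ordering};

/// A statically allocated node carrying a hypothetical "already linked" flag.
pub struct Node {
    pub value: &'static str,
    next: AtomicPtr<Node>,
    submitted: AtomicBool,
}

impl Node {
    pub const fn new(value: &'static str) -> Self {
        Node {
            value,
            next: AtomicPtr::new(ptr::null_mut()),
            submitted: AtomicBool::new(false),
        }
    }
}

pub struct Registry {
    head: AtomicPtr<Node>,
}

impl Registry {
    pub const fn new() -> Self {
        Registry { head: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Push-front that ignores a node which has been submitted before.
    pub fn submit(&self, node: &'static Node) {
        if node.submitted.swap(true, Ordering::Relaxed) {
            return; // already linked; re-linking would create a cycle
        }
        let new = node as *const Node as *mut Node;
        let mut head = self.head.load(Ordering::Relaxed);
        loop {
            node.next.store(head, Ordering::Relaxed);
            match self.head.compare_exchange(head, new, Ordering::Release, Ordering::Relaxed) {
                Ok(_) => return,
                Err(actual) => head = actual,
            }
        }
    }

    /// Visits at most `limit` nodes, as a safety net for the demonstration.
    pub fn walk(&self, limit: usize) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut cur = self.head.load(Ordering::Acquire);
        while !cur.is_null() && out.len() < limit {
            // Safety: every pointer in the list comes from a `&'static Node`.
            let node: &Node = unsafe { &*cur };
            out.push(node.value);
            cur = node.next.load(Ordering::Relaxed);
        }
        out
    }
}

static A: Node = Node::new("a");
static B: Node = Node::new("b");

/// Stand-in for one `__wasm_call_ctors` invocation.
pub fn run_ctors(registry: &Registry) {
    registry.submit(&A);
    registry.submit(&B);
}
```

The per-node flag assumes each node is only ever submitted to a single registry.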

dtolnay

Thanks!

dfoxfranke added 3 commits January 7, 2025 08:47

Use .init_array on all WASM targets, not just WASI/Emscripten

9db0023

Ensure that constructors are idempotent on WASM

4ca062a

Document WebAssembly support

06c226f

dfoxfranke force-pushed the master branch from 3e53c1c to 06c226f Compare January 7, 2025 16:14

dtolnay approved these changes Jan 7, 2025

dtolnay merged commit b131245 into dtolnay:master Jan 7, 2025
12 checks passed
