Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld][WebAssembly] Fix bitcode LTO order in archive parsing #73095

Merged
merged 7 commits into from
Nov 29, 2023

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Nov 22, 2023

tl;dr:
When doing LTO on multiple archives, the order with which bitcodes are linked to the LTO module is hard to control, given that processing undefined symbols can lead to parsing of an object file, which in turn lead to parsing of another object file before finishing parsing of the previous file. This can result in encountering a non-prevailing comdat first when linking, which can make the the symbol undefined, and the real definition is added later with an additional prefix to avoid duplication (e.g. __cxx_global_var_init and __cxx_global_var_init.2)

So this one-line fix ensures we compile bitcodes in the order that we process comdats, so that when multiple archived bitcode files have the same variable with the same comdat, we make sure that the prevailing comdat will be linked first in the LTO.

Fixes #62243.


  • Long version, feel free to skip:

When linking (non-archived) bitcode files directly in LTO, we add files in bitcodeFiles vector in the order they are given in the command line:

// Add all files to the symbol table. This will add almost all
// symbols that we need to the symbol table.
for (InputFile *f : files)
symtab->addFile(f);
->
bitcodeFiles.push_back(f);

The order they are added in bitcodeFiles is the order they are compiled and linked into the LTO module later.

While parsing these bitcode files, we also add symbols to the symbol table. Depending on their status, they are added as defined symbols or undefined symbols.


On the other hand, when linking archives that contain bitcode files, the order bitcode files are added to bitcodeFiles vector is hard to predict.

When parsing archives, firstly, symbols are parsed as lazy symbols:

// Read the symbol table to construct Lazy symbols.
int count = 0;
for (const Archive::Symbol &sym : file->symbols()) {
symtab->addLazy(this, &sym);
++count;
}

And then we handle exported symbols here:

// Handle the `--export <sym>` options
// This works like --undefined but also exports the symbol if its found
for (auto &iter : config->exportedSymbols)
handleUndefined(iter.first(), "--export");
where we try to fetch lazy symbols' original objects:
lazySym->fetch();
->
void LazySymbol::fetch() { cast<ArchiveFile>(file)->addMember(&archiveSymbol); }
which causes the corresponding object file to be added to the symbol table:
symtab->addFile(obj, sym->getName());
and the object file is parsed:
f->parse(symName);

But this parse call can lead to the parsing of other files before it is added to bitcodeFiles in the next line. For example, we have two bitcode files a.o and b.o, each of which is archived as liba.a and libb.a. Both files have a variable with the same name and the same comdat. When handling undefined symbols, suppose we start from a symbol from a.o. We arrive here and all comdats in a.o are added to keptComdats:

for (std::pair<StringRef, Comdat::SelectionKind> s : obj->getComdatTable())
keptComdats.push_back(symtab->addComdat(s.first));
And we call createBitCodeSymbol on all symbols in a.o:
for (const lto::InputFile::Symbol &objSym : obj->symbols())
symbols.push_back(createBitcodeSymbol(keptComdats, objSym, *this));
But this can cause the loading of another object in case some of those symbols are undefined functions or data, like here:
return symtab->addUndefinedFunction(name, std::nullopt, std::nullopt,
flags, &f, nullptr, true);
->
lazy->fetch();

So effectively, in the middle of parsing a.o, we proceed to parse b.o. And when parsing b.o, we encounter the comdat variable. But because we already added that comdat in keptComdats when parsing a.o, so b.o's comdat variable ends up being excluded, so it is added as an undefined data, whereas a.o's comdat variable will be added as a defined data:

bool excludedByComdat = c != -1 && !keptComdats[c];
if (objSym.isUndefined() || excludedByComdat) {
flags |= WASM_SYMBOL_UNDEFINED;
if (objSym.isExecutable())
return symtab->addUndefinedFunction(name, std::nullopt, std::nullopt,
flags, &f, nullptr, true);
return symtab->addUndefinedData(name, flags, &f);
}
if (objSym.isExecutable())
return symtab->addDefinedFunction(name, flags, &f, nullptr);
return symtab->addDefinedData(name, flags, &f, nullptr, 0, 0);

For a lazy symbol, addUndefinedData does not replace the symbol unless it is the first time being inserted, but addDefinedData does:

if (wasInserted || s->isLazy()) {
replaceSym();
return s;
}
And because of this a.o's comdat is to be the prevailing one, because the symbol's getFile() will return a.o:
r.Prevailing = !objSym.isUndefined() && sym->getFile() == &f;
But because b.o's parsing started in the middle of a.os parsing, it ends first and gets added to bitcodeFiles first.
bitcodeFiles.push_back(f);

So to sum up, bitcodeFiles now contains [b.o, a.o] but a.o has the prevailing comdat. This does not happen when directly linking bitcodes (without archives) because symbols are added in one-pass in the order specified in the command line. This also does not happen in non-LTO (even with archives), because object parsing uses ObjFile::parse which is a separate process from this bitcode processing.


I'm not sure if there is a rule that the prevailing comdat has to exist in the first file having that comdat within bitcodeFiles, but my observation says this is likely the case.

When adding bitcodes to the LTO, if a symbol's comdat is not a prevailing one, this code demotes its linkage to available_externally:

if (Res.Prevailing) {
if (Sym.isUndefined())
continue;
Mod.Keep.push_back(GV);
// For symbols re-defined with linker -wrap and -defsym options,
// set the linkage to weak to inhibit IPO. The linkage will be
// restored by the linker.
if (Res.LinkerRedefined)
GV->setLinkage(GlobalValue::WeakAnyLinkage);
GlobalValue::LinkageTypes OriginalLinkage = GV->getLinkage();
if (GlobalValue::isLinkOnceLinkage(OriginalLinkage))
GV->setLinkage(GlobalValue::getWeakLinkage(
GlobalValue::isLinkOnceODRLinkage(OriginalLinkage)));
} else if (isa<GlobalObject>(GV) &&
(GV->hasLinkOnceODRLinkage() || GV->hasWeakODRLinkage() ||
GV->hasAvailableExternallyLinkage()) &&
!AliasedGlobals.count(cast<GlobalObject>(GV))) {
// Any of the above three types of linkage indicates that the
// chosen prevailing symbol will have the same semantics as this copy of
// the symbol, so we may be able to link it with available_externally
// linkage. We will decide later whether to do that when we link this
// module (in linkRegularLTO), based on whether it is undefined.
Mod.Keep.push_back(GV);
GV->setLinkage(GlobalValue::AvailableExternallyLinkage);
if (GV->hasComdat())
NonPrevailingComdats.insert(GV->getComdat());
and subsequently all other variables that are associated with the same comdat in that object file are demoted in the same way.
// Additionally need to drop all global values from the comdat to
// available_externally, to satisfy the COMDAT requirement that all members
// are discarded as a unit. The non-local linkage global values avoid
// duplicate definition linker errors.
GV.setLinkage(GlobalValue::AvailableExternallyLinkage);
In out case, because of the orders of bitcodeFiles, we start from b.o, and the comdat symbols there become available_externally. This is how __cxx_global_var_init in global_var_init2.ll in the attached test becomes available_externally.

When linking those bitcodes in the LTO, if we traverse symbols in the order of prevailing->non-prevailing, we end up keeping only one global variable, but if we traverse in the other way around (i.e., non-prevailing first), we end up adding two symbols, because it lets us add a available_externally symbol when we don't already have a definition for it:

if (!GV->hasAvailableExternallyLinkage()) {
Keep.push_back(GV);
continue;
}
// Only link available_externally definitions if we don't already have a
// definition.
GlobalValue *CombinedGV =
RegularLTO.CombinedModule->getNamedValue(GV->getName());
if (CombinedGV && !CombinedGV->isDeclaration())
continue;
Keep.push_back(GV);
Also when it is an available_externally symbol, it becomes a declaration, not a definition. So we end up with a module like this:

declare dso_local void @__cxx_global_var_init()
define internal void @__cxx_global_var_init.2() comdat($unused) {
  ...
}

So this one-line fix ensures we compile bitcodes in the order that we process comdats, so that when multiple archived bitcode files have the same variable with the same comdat, we make sure that the prevailing comdat will be linked first in the LTO.


This crash is said to be happening after 12050a3 but even reverting this is not really a solution for us because we end up with two definitions if we do that:

define internal void @__cxx_global_var_init() comdat($unused) {
  ...
}
define internal void @__cxx_global_var_init.2() comdat($unused) {
  ...
}

And the wasm's __wasm_call_ctors will be like

 (func $__wasm_call_ctors
  (call $emscripten_stack_init)
  (call $__cxx_global_var_init)
  (call $__cxx_global_var_init.2)
 )

tl;dr:
When doing LTO on multiple archives, the order with which bitcodes are
linked to the LTO module is hard to control, given that processing
undefined symbols can lead to parsing of an object file, which in turn
lead to parsing of another object file before finishing parsing of the
previous file. This can result in encountering a non-prevailing comdat
first when linking, which can make the the symbol undefined, and the
real definition is added later with an additional prefix to avoid
duplication (e.g. `__cxx_global_var_init` and `__cxx_global_var_init.2`)

So this one-line fix ensures we compile bitcodes in the order that we
process comdats, so that when multiple archived bitcode files have the
same variable with the same comdat, we make sure that the prevailing
comdat will be linked first in the LTO.

Fixes llvm#62243.

---

- Long version, feel free to skip:

When linking (non-archived) bitcode files directly in LTO, we add files
in `bitcodeFiles` vector in the order they are given in the command
line:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/Driver.cpp#L1201-L1204 ->
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/SymbolTable.cpp#L54

The order they are added in `bitcodeFiles` is the order they are
compiled and linked into the LTO module later.

While parsing these bitcode files, we also add symbols to the symbol
table. Depending on their status, they are added as defined symbols or
undefined symbols.

---

On the other hand, when linking archives that contain bitcode files, the
order bitcode files are added to `bitcodeFiles` vector is hard to
predict.

When parsing archives, firstly, symbols are parsed as lazy symbols:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L734-L739

And then we handle undefined symbols here:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/Driver.cpp#L1208-L1210
where we try to fetch lazy symbols' original objects:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/Driver.cpp#L708 ->
https://github.com/llvm/llvm-project/blob/907ed77ad1d7a154317e5f8398d17d441711dc38/lld/wasm/Symbols.cpp#L428
which causes the corresponding object file to be added to the symbol
table:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L763
and the object file is parsed:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/SymbolTable.cpp#L53

But this `parse` call can lead to the parsing of other files before it
is added to `bitcodeFiles` in the next line. For example, we have two
bitcode files `a.o` and `b.o`, each of which is archived as `liba.a` and
`libb.a`. Both files have a variable with the same name and the same
comdat. When handling undefined symbols, suppose we start from a symbol
from `a.o`. We arrive here and all comdats in `a.o` are added to
`keptComdats`:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L844-L845
And we call `createBitCodeSymbol` on all symbols in `a.o`:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L847-L848
But this can cause the loading of another object in case some of those
symbols are undefined functions or data, like here:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L791-L792
->
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/SymbolTable.cpp#L534

So effectively, in the middle of parsing `a.o`, we proceed to parse
`b.o`. And when parsing `b.o`, we encounter the comdat variable. But
because we already added that comdat in `keptComdats` when parsing
`a.o`, so `b.o`'s comdat variable ends up being excluded, so it is added
as an undefined data, whereas `a.o`'s comdat variable will be added as a
defined data:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/InputFiles.cpp#L786-L798

For a lazy symbol, `addUndefinedData` does not replace the symbol unless
it is the first time being inserted, but `addDefinedData` does:
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/SymbolTable.cpp#L382-L385
And because of this `a.o`'s comdat is to be the prevailing one, because
the symbol's `getFile()` will return `a.o`:
https://github.com/llvm/llvm-project/blob/907ed77ad1d7a154317e5f8398d17d441711dc38/lld/wasm/LTO.cpp#L104
But because `b.o`'s parsing started in the middle of `a.o`s parsing, it
ends first and gets added to `bitcodeFiles` first.
https://github.com/llvm/llvm-project/blob/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286/lld/wasm/SymbolTable.cpp#L54

So to sum up, `bitcodeFiles` now contains [`b.o`, `a.o`] but `a.o` has
the prevailing comdat. This does not happen when directly linking
bitcodes (without archives) because symbols are added in one-pass in the
order specified in the command line. This also does not happen in
non-LTO (even with archives), because object parsing uses
`ObjFile::parse` which is a separate process from this bitcode
processing.

---

I'm not sure if there is a rule that the prevailing comdat has to exist
in the first file having that comdat within `bitcodeFiles`, but my
observation says this is likely the case.

When adding bitcodes to the LTO, if a symbol's comdat is not a
prevailing one, this code demotes its linkage to `available_externally`:
https://github.com/llvm/llvm-project/blob/23c84fb362849865990c0e160158b19f54742147/llvm/lib/LTO/LTO.cpp#L869-L895
and subsequently all other variables that are associated with the same
comdat in that object file are demoted in the same way.
https://github.com/llvm/llvm-project/blob/23c84fb362849865990c0e160158b19f54742147/llvm/lib/LTO/LTO.cpp#L788-L792
In out case, because of the orders of `bitcodeFiles`, we start from
`b.o`, and the comdat symbols there become `available_externally`. This
is how `__cxx_global_var_init` in `global_var_init2.ll` in the attached
test becomes `available_externally`.

When linking those bitcodes in the LTO, if we traverse symbols in
the order of prevailing->non-prevailing, we end up keeping only one
global variable, but if we traverse in the other way around (i.e.,
non-prevailing first), we end up adding two symbols, because it lets us
add a `available_externally` symbol when we don't already have a
definition for it:
https://github.com/llvm/llvm-project/blob/23c84fb362849865990c0e160158b19f54742147/llvm/lib/LTO/LTO.cpp#L974-L986
Also when it is an `available_externally` symbol, it becomes a
declaration, not a definition. So we end up with a module like this:
```ll
declare dso_local void @__cxx_global_var_init()
define internal void @__cxx_global_var_init.2() comdat($unused) {
  ...
}
```

---

So this one-line fix ensures we compile bitcodes in the order that we
process comdats, so that when multiple archived bitcode files have the
same variable with the same comdat, we make sure that the prevailing
comdat will be linked first in the LTO.

---

This crash is said to be happening after llvm@12050a3
but even reverting this is not really a solution for us because we end
up with two definitions if we do that:
```ll
define internal void @__cxx_global_var_init() comdat($unused) {
  ...
}
define internal void @__cxx_global_var_init.2() comdat($unused) {
  ...
}
```
And the wasm's `__wasm_call_ctors` will be like
```wast
 (func $__wasm_call_ctors
  (call $emscripten_stack_init)
  (call $__cxx_global_var_init)
  (call $__cxx_global_var_init.2)
 )
```
@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2023

@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-lld

Author: Heejin Ahn (aheejin)

Changes

tl;dr:
When doing LTO on multiple archives, the order with which bitcodes are linked to the LTO module is hard to control, given that processing undefined symbols can lead to parsing of an object file, which in turn lead to parsing of another object file before finishing parsing of the previous file. This can result in encountering a non-prevailing comdat first when linking, which can make the the symbol undefined, and the real definition is added later with an additional prefix to avoid duplication (e.g. __cxx_global_var_init and __cxx_global_var_init.2)

So this one-line fix ensures we compile bitcodes in the order that we process comdats, so that when multiple archived bitcode files have the same variable with the same comdat, we make sure that the prevailing comdat will be linked first in the LTO.

Fixes #62243.


  • Long version, feel free to skip:

When linking (non-archived) bitcode files directly in LTO, we add files in bitcodeFiles vector in the order they are given in the command line:

// Add all files to the symbol table. This will add almost all
// symbols that we need to the symbol table.
for (InputFile *f : files)
symtab->addFile(f);
->
bitcodeFiles.push_back(f);

The order they are added in bitcodeFiles is the order they are compiled and linked into the LTO module later.

While parsing these bitcode files, we also add symbols to the symbol table. Depending on their status, they are added as defined symbols or undefined symbols.


On the other hand, when linking archives that contain bitcode files, the order bitcode files are added to bitcodeFiles vector is hard to predict.

When parsing archives, firstly, symbols are parsed as lazy symbols:

// Read the symbol table to construct Lazy symbols.
int count = 0;
for (const Archive::Symbol &sym : file->symbols()) {
symtab->addLazy(this, &sym);
++count;
}

And then we handle undefined symbols here:

// Handle the `--undefined <sym>` options.
for (auto *arg : args.filtered(OPT_undefined))
handleUndefined(arg->getValue(), "<internal>");
where we try to fetch lazy symbols' original objects:
lazySym->fetch();
->
void LazySymbol::fetch() { cast<ArchiveFile>(file)->addMember(&archiveSymbol); }
which causes the corresponding object file to be added to the symbol table:
symtab->addFile(obj, sym->getName());
and the object file is parsed:
f->parse(symName);

But this parse call can lead to the parsing of other files before it is added to bitcodeFiles in the next line. For example, we have two bitcode files a.o and b.o, each of which is archived as liba.a and libb.a. Both files have a variable with the same name and the same comdat. When handling undefined symbols, suppose we start from a symbol from a.o. We arrive here and all comdats in a.o are added to keptComdats:

for (std::pair<StringRef, Comdat::SelectionKind> s : obj->getComdatTable())
keptComdats.push_back(symtab->addComdat(s.first));
And we call createBitCodeSymbol on all symbols in a.o:
for (const lto::InputFile::Symbol &objSym : obj->symbols())
symbols.push_back(createBitcodeSymbol(keptComdats, objSym, *this));
But this can cause the loading of another object in case some of those symbols are undefined functions or data, like here:
return symtab->addUndefinedFunction(name, std::nullopt, std::nullopt,
flags, &f, nullptr, true);
->
lazy->fetch();

So effectively, in the middle of parsing a.o, we proceed to parse b.o. And when parsing b.o, we encounter the comdat variable. But because we already added that comdat in keptComdats when parsing a.o, so b.o's comdat variable ends up being excluded, so it is added as an undefined data, whereas a.o's comdat variable will be added as a defined data:

bool excludedByComdat = c != -1 && !keptComdats[c];
if (objSym.isUndefined() || excludedByComdat) {
flags |= WASM_SYMBOL_UNDEFINED;
if (objSym.isExecutable())
return symtab->addUndefinedFunction(name, std::nullopt, std::nullopt,
flags, &f, nullptr, true);
return symtab->addUndefinedData(name, flags, &f);
}
if (objSym.isExecutable())
return symtab->addDefinedFunction(name, flags, &f, nullptr);
return symtab->addDefinedData(name, flags, &f, nullptr, 0, 0);

For a lazy symbol, addUndefinedData does not replace the symbol unless it is the first time being inserted, but addDefinedData does:

if (wasInserted || s->isLazy()) {
replaceSym();
return s;
}
And because of this a.o's comdat is to be the prevailing one, because the symbol's getFile() will return a.o:
r.Prevailing = !objSym.isUndefined() && sym->getFile() == &f;
But because b.o's parsing started in the middle of a.os parsing, it ends first and gets added to bitcodeFiles first.
bitcodeFiles.push_back(f);

So to sum up, bitcodeFiles now contains [b.o, a.o] but a.o has the prevailing comdat. This does not happen when directly linking bitcodes (without archives) because symbols are added in one-pass in the order specified in the command line. This also does not happen in non-LTO (even with archives), because object parsing uses ObjFile::parse which is a separate process from this bitcode processing.


I'm not sure if there is a rule that the prevailing comdat has to exist in the first file having that comdat within bitcodeFiles, but my observation says this is likely the case.

When adding bitcodes to the LTO, if a symbol's comdat is not a prevailing one, this code demotes its linkage to available_externally:

if (Res.Prevailing) {
if (Sym.isUndefined())
continue;
Mod.Keep.push_back(GV);
// For symbols re-defined with linker -wrap and -defsym options,
// set the linkage to weak to inhibit IPO. The linkage will be
// restored by the linker.
if (Res.LinkerRedefined)
GV->setLinkage(GlobalValue::WeakAnyLinkage);
GlobalValue::LinkageTypes OriginalLinkage = GV->getLinkage();
if (GlobalValue::isLinkOnceLinkage(OriginalLinkage))
GV->setLinkage(GlobalValue::getWeakLinkage(
GlobalValue::isLinkOnceODRLinkage(OriginalLinkage)));
} else if (isa<GlobalObject>(GV) &&
(GV->hasLinkOnceODRLinkage() || GV->hasWeakODRLinkage() ||
GV->hasAvailableExternallyLinkage()) &&
!AliasedGlobals.count(cast<GlobalObject>(GV))) {
// Any of the above three types of linkage indicates that the
// chosen prevailing symbol will have the same semantics as this copy of
// the symbol, so we may be able to link it with available_externally
// linkage. We will decide later whether to do that when we link this
// module (in linkRegularLTO), based on whether it is undefined.
Mod.Keep.push_back(GV);
GV->setLinkage(GlobalValue::AvailableExternallyLinkage);
if (GV->hasComdat())
NonPrevailingComdats.insert(GV->getComdat());
and subsequently all other variables that are associated with the same comdat in that object file are demoted in the same way.
// Additionally need to drop all global values from the comdat to
// available_externally, to satisfy the COMDAT requirement that all members
// are discarded as a unit. The non-local linkage global values avoid
// duplicate definition linker errors.
GV.setLinkage(GlobalValue::AvailableExternallyLinkage);
In out case, because of the orders of bitcodeFiles, we start from b.o, and the comdat symbols there become available_externally. This is how __cxx_global_var_init in global_var_init2.ll in the attached test becomes available_externally.

When linking those bitcodes in the LTO, if we traverse symbols in the order of prevailing->non-prevailing, we end up keeping only one global variable, but if we traverse in the other way around (i.e., non-prevailing first), we end up adding two symbols, because it lets us add a available_externally symbol when we don't already have a definition for it:

if (!GV->hasAvailableExternallyLinkage()) {
Keep.push_back(GV);
continue;
}
// Only link available_externally definitions if we don't already have a
// definition.
GlobalValue *CombinedGV =
RegularLTO.CombinedModule->getNamedValue(GV->getName());
if (CombinedGV && !CombinedGV->isDeclaration())
continue;
Keep.push_back(GV);
Also when it is an available_externally symbol, it becomes a declaration, not a definition. So we end up with a module like this:

declare dso_local void @<!-- -->__cxx_global_var_init()
define internal void @<!-- -->__cxx_global_var_init.2() comdat($unused) {
  ...
}

So this one-line fix ensures we compile bitcodes in the order that we process comdats, so that when multiple archived bitcode files have the same variable with the same comdat, we make sure that the prevailing comdat will be linked first in the LTO.


This crash is said to be happening after 12050a3 but even reverting this is not really a solution for us because we end up with two definitions if we do that:

define internal void @<!-- -->__cxx_global_var_init() comdat($unused) {
  ...
}
define internal void @<!-- -->__cxx_global_var_init.2() comdat($unused) {
  ...
}

And the wasm's __wasm_call_ctors will be like

 (func $__wasm_call_ctors
  (call $emscripten_stack_init)
  (call $__cxx_global_var_init)
  (call $__cxx_global_var_init.2)
 )

Full diff: https://github.com/llvm/llvm-project/pull/73095.diff

4 Files Affected:

  • (added) lld/test/wasm/lto/Inputs/global_var_init1.ll (+33)
  • (added) lld/test/wasm/lto/Inputs/global_var_init2.ll (+30)
  • (added) lld/test/wasm/lto/global_var_init.test (+17)
  • (modified) lld/wasm/SymbolTable.cpp (+1-1)
diff --git a/lld/test/wasm/lto/Inputs/global_var_init1.ll b/lld/test/wasm/lto/Inputs/global_var_init1.ll
new file mode 100644
index 000000000000000..1478a9070ea0ec8
--- /dev/null
+++ b/lld/test/wasm/lto/Inputs/global_var_init1.ll
@@ -0,0 +1,33 @@
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128"
+target triple = "wasm32-unknown-unknown"
+
+$unused = comdat any
+
+@unused = linkonce_odr global i32 0, comdat, align 4
+@_ZGV6unused = linkonce_odr global i32 0, comdat($unused), align 4
+@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init, ptr @unused }]
+
+define internal void @__cxx_global_var_init() comdat($unused) {
+entry:
+  %0 = load i8, ptr @_ZGV6unused, align 4
+  %1 = and i8 %0, 1
+  %guard.uninitialized = icmp eq i8 %1, 0
+  br i1 %guard.uninitialized, label %init.check, label %init.end
+
+init.check:                                       ; preds = %entry
+  store i8 1, ptr @_ZGV6unused, align 4
+  %call = call i32 @foo()
+  store i32 %call, ptr @unused, align 4
+  br label %init.end
+
+init.end:                                         ; preds = %init.check, %entry
+  ret void
+}
+
+declare i32 @foo()
+
+define i32 @main() {
+entry:
+  %call = call i32 @foo()
+  ret i32 %call
+}
diff --git a/lld/test/wasm/lto/Inputs/global_var_init2.ll b/lld/test/wasm/lto/Inputs/global_var_init2.ll
new file mode 100644
index 000000000000000..66f60e1cf3e3bbf
--- /dev/null
+++ b/lld/test/wasm/lto/Inputs/global_var_init2.ll
@@ -0,0 +1,30 @@
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128"
+target triple = "wasm32-unknown-unknown"
+
+$unused = comdat any
+
+@unused = linkonce_odr global i32 0, comdat, align 4
+@_ZGV6unused = linkonce_odr global i32 0, comdat($unused), align 4
+@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init, ptr @unused }]
+
+define internal void @__cxx_global_var_init() comdat($unused) {
+entry:
+  %0 = load i8, ptr @_ZGV6unused, align 4
+  %1 = and i8 %0, 1
+  %guard.uninitialized = icmp eq i8 %1, 0
+  br i1 %guard.uninitialized, label %init.check, label %init.end
+
+init.check:                                       ; preds = %entry
+  store i8 1, ptr @_ZGV6unused, align 4
+  %call = call i32 @foo()
+  store i32 %call, ptr @unused, align 4
+  br label %init.end
+
+init.end:                                         ; preds = %init.check, %entry
+  ret void
+}
+
+define i32 @foo() {
+entry:
+  ret i32 42
+}
diff --git a/lld/test/wasm/lto/global_var_init.test b/lld/test/wasm/lto/global_var_init.test
new file mode 100644
index 000000000000000..f1b09a092a1fcfb
--- /dev/null
+++ b/lld/test/wasm/lto/global_var_init.test
@@ -0,0 +1,17 @@
+; Check if we handle __cxx_global_var_init in different LTO bitcode modules
+; sharing a comdat.
+
+; RUN: llvm-as %S/Inputs/global_var_init1.ll -o %t1.o
+; RUN: llvm-as %S/Inputs/global_var_init2.ll -o %t2.o
+; RUN: llvm-ar rcs %t1.a %t1.o
+; RUN: llvm-ar rcs %t2.a %t2.o
+; RUN: wasm-ld %t1.a %t2.a -o %t.wasm --no-entry --export=main --export=__wasm_call_ctors
+; RUN: obj2yaml %t.wasm | FileCheck %s
+
+; CHECK:       - Type:            CUSTOM
+; CHECK-NEXT:    Name:            name
+; CHECK-NEXT:    FunctionNames:
+; CHECK-NEXT:      - Index:           0
+; CHECK-NEXT:        Name:            __wasm_call_ctors
+; CHECK-NEXT:      - Index:           1
+; CHECK-NEXT:        Name:            __cxx_global_var_init
diff --git a/lld/wasm/SymbolTable.cpp b/lld/wasm/SymbolTable.cpp
index a00e336118d8c84..6d283c815d44701 100644
--- a/lld/wasm/SymbolTable.cpp
+++ b/lld/wasm/SymbolTable.cpp
@@ -50,8 +50,8 @@ void SymbolTable::addFile(InputFile *file, StringRef symName) {
 
   // LLVM bitcode file
   if (auto *f = dyn_cast<BitcodeFile>(file)) {
-    f->parse(symName);
     bitcodeFiles.push_back(f);
+    f->parse(symName);
     return;
   }
 

@llvm llvm deleted a comment from aheejin Nov 28, 2023
@llvm llvm deleted a comment from aheejin Nov 28, 2023
@llvm llvm deleted a comment from aheejin Nov 28, 2023
bitcodeFiles.push_back(f);
f->parse(symName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good change to me. The fact that ELF does it this way makes me even more confident this is correct:

ctx.bitcodeFiles.push_back(f);
f->parse();

;
; int main() {
; return foo();
; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if its possible to reduce these test files something smaller by hand writing them instead?

I'd always prefer something more concise in the test files in possible.

Also, I wonder if we even need to mention global_var_init .. presumably this problem applies to all comat groups? Should we call these something comdat_ordering.ll?

Copy link
Member Author

@aheejin aheejin Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if its possible to reduce these test files something smaller by hand writing them instead?

I'd always prefer something more concise in the test files in possible.

It was generated from that C++ code but hand-simplified later to delete some unnecessary parts. I think I can delete some more, but the overall structure is still the same. (That @llvm.global_ctors is still necessary; without this the error does not occur.) I think I can delete this comment. Also I can't completely empty out the body of @__cxx_global_var_init to something like ret void, in which case it is just deleted.

Also, I wonder if we even need to mention global_var_init .. presumably this problem applies to all comat groups? Should we call these something comdat_ordering.ll?

I don't think this problem is specific to global_var_init, but that all bug reports we got contained it kind of makes me think that the condition this bug is triggered is mostly created by global_var_init. I think I can rename the tests, but I guess it is fine to leave the variables in the tests as __cxx_global_var_init.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, no, turns out I can't really simplify the contents of @__cxx_global_var_init much. Even if I try to simplify it a little, the unused contents will be somehow deleted and @__cxx_global_var_init itself is deleted. Will rename the tests, and clarify the comment to say it is simplified after generation. (I think it is still better to have what it is generated from)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for debugging this issue! It must have been a real pain to track down. Nice to see such as simple fix though.

@sbc100
Copy link
Collaborator

sbc100 commented Nov 28, 2023

(when landing maybe just include the short description?)

@aheejin
Copy link
Member Author

aheejin commented Nov 28, 2023

(when landing maybe just include the short description?)

Sounds good

@nico
Copy link
Contributor

nico commented Jan 4, 2024

Does doing something like #67445 instead help as well? If so, maybe go with that instead.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 4, 2024

Does doing something like #67445 instead help as well? If so, maybe go with that instead.

This change seems like a clear improvement/bugfix to me. If something like #67445 is needed then that would probably be for a separate issue, which I guess we have not yet seen?

@nico
Copy link
Contributor

nico commented Jan 4, 2024

This change seems like a clear improvement/bugfix to me. If something like #67445 is needed then that would probably be for a separate issue, which I guess we have not yet seen?

I think it prevents the "can lead to parsing of an object file, which in turn lead to parsing of another object file before finishing parsing of the previous file" issue mentioned further up, or at least limits it a bit. So maybe it helps here too. Not sure though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wasm-ld crash when LTO enabled: __cxx_global_var_init.1 present in "init functions" but not defined
4 participants