cmd/cgo, cmd/link: with zig as CC/CXX, Go linker does not put libc onto the linker line, causing undefined symbol errors #52690
Comments
With compilers like GCC and clang, the compiler links against the C library by default. Does |
Thanks. The Go linker never links directly against the C library. This normally works because the Go build decides that |
A typical linker line for any C compiler will look something like this:
In this example, I'm not sure what you mean when you say the Go linker never links directly against the C library. Indeed, We can simulate this same problem with any C compiler by passing
Ultimately, the Go toolchain is compiling C source files into object files which contain libc dependencies. If any C compiler is used to then link these together, everything works fine, because C compilers put libc onto the linker line. However, when the Go linker is used, which is the default, it does not satisfy its own dependencies. I think one thing that would shed some light on this issue would be seeing, side by side, the Go linker line when |
I think what you're looking for is (I'm pretty sure this is the same as #44695, but whichever gets the most traction works for me 😄) |
With
The flags are nearly identical, and none mention libc in any way. |
I mean literally that. The Go linker never opens the C library.
In the GCC case, the cgo tool is run against runtime/cgo and runtime/race. This happens when building those packages, not at link time. Among other things, cgo runs a C link to produce a temporary executable. It examines that executable for references to symbols defined in shared libraries. It passes a list of those symbols and the corresponding shared libraries to the Go linker. The Go linker uses that to build a dynamic symbol table that tells the dynamic linker ( Somehow that is failing when using |
Data point: the I am comparing workdirs of
This is how go builds the object for
(ditto for clang-13)
Result: the
The
Digging ... |
Thanks for the clues everyone.
Aha, I think we have almost gotten to the bottom of this. This strategy means that the temporary executable needs to be exemplary in terms of linking to libc. If, for instance, the temporary executable happened to have advanced linker optimizations enabled, garbage collecting unused libc dependencies, then Go's strategy of using it as an example of how to link to libc would not work. In order to avoid this situation, Go would need to make sure this temporary executable had libc function calls to all the needed symbols.
One way to test this hypothesis would be with the following patch to Zig:
--- a/src/link/Elf.zig
+++ b/src/link/Elf.zig
@@ -1691,9 +1691,9 @@ fn linkWithLLD(self: *Elf, comp: *Compilation, prog_node: *std.Progress.Node) !v
             argv.appendAssumeCapacity(arg);
         }
-        if (!as_needed) {
-            argv.appendAssumeCapacity("--as-needed");
-            as_needed = true;
+        if (as_needed) {
+            argv.appendAssumeCapacity("--no-as-needed");
+            as_needed = false;
         }
         // libc++ dep
However, I want to stress that using |
The patch doesn't help: the linker flag gets added, but the resulting intermediate I've extracted the
Verbose
I think it will be fair to talk more about this when we get to the bottom of the problem (and a workaround). :) |
It's
Adding
When building object files, `zig cc` will instruct lld to remove unused sections via `--gc-sections`. This is problematic for cgo. Briefly, go builds cgo executables as follows*:

1. build `_cgo_.o`, which links (on linux_amd64 systems) to `/usr/local/go/src/runtime/race/race_linux_amd64.syso`.
2. That `.syso` contains references to symbols from libc.

If the actual Go program uses at least one libc symbol, it will link correctly. However, if Go is building a cgo executable, but without any C code, the sections from the `.syso` file will be garbage-collected, leaving a `_cgo_.o` without any references to libc.

I assume the `gc_sections` is an optimization. If yes, then it makes sense for the final executable, but not for the intermediate object. If not, please correct me.

Quoting @andrewrk in [1]:

> The C source code for the temporary executable needs to have dummy
> calls to getuid, pthread_self, sleep, and every other libc function
> that cgo/race code wants to call.

I agree in this case. However, while we could potentially fix it for go, I don't know how many other systems do that, which complicates use of `zig cc` for other projects.

If we consider `zig cc` a drop-in clang replacement (except for `-fsanitize=undefined`, which I tend to agree with), then it should not be optimizing the intermediate object files. I assume this was added as an optimization. If that's correct, let's optimize the final executable, but not the intermediate objects.

Fixes ziglang#11398
Fixes golang/go#44695
Fixes golang/go#52690

[*]: Empirically observed with `CGO_ENABLED=1 go test -race -x -v`
[1]: golang/go#52690 (comment)
When building object files, `zig cc` will instruct lld to remove unused sections via `--gc-sections`. This is problematic for cgo builds that don't explicitly use C code. Briefly, go builds cgo executables as follows*:

1. build `_cgo_.o`, which links (on linux_amd64 systems) to `/usr/local/go/src/runtime/race/race_linux_amd64.syso`.
2. That `.syso` contains references to symbols from libc.

If the user program uses at least one libc symbol, it will link correctly. However, if Go is building a cgo executable, but without C code, the sections from the `.syso` file will be garbage-collected, leaving a `_cgo_.o` without any references to libc, causing the final linking step to not link libc.

Until now, this could be worked around by passing the `-linkmode external` flag to `go build`. This causes Go to link the final executable using the external linker (which implicitly links libc). However, that flag brings in a whole different world of worms.

I assume the `gc_sections` is an optimization; I tried to re-add `--gc-sections` to the final executable, but that didn't go well. I know removing such an optimization may be contentious, so let's start the discussion here.

Quoting @andrewrk in [1] (it was about `--as-needed`, but the point remains the same):

> The C source code for the temporary executable needs to have dummy
> calls to getuid, pthread_self, sleep, and every other libc function
> that cgo/race code wants to call.

I agree this is how it *should* work. However, while we could fix it for go, I don't know how many other systems rely on that, and we'll never know we've fixed the last one.

The point is, GCC/Clang do not garbage-collect sections by default, and downstream tools rely on that. If we want to consider `zig cc` a drop-in clang replacement (except for `-fsanitize=undefined`, which I tend to agree with), then it should not be optimizing the intermediate object files. Or at least have a very prominent fine-print that this is happening, with ways to work around it.

Fixes ziglang#11398
Fixes golang/go#44695
Fixes golang/go#52690

[*]: Empirically observed with `CGO_ENABLED=1 go test -race -x -v`
[1]: golang/go#52690 (comment)
`zig cc` emits `--gc-sections` for the linker, which is incompatible with what CGo thinks about linking. This commit adds a workaround: it will add `--no-gc-sections` to the linking step if the command is not specified (falling back to the default behavior of gcc/clang). Related: golang/go#52690
It would be fine with me if somebody wants to send a cgo patch that passes -Wl,--no-gc-sections, with a fallback if that option is not supported. |
I think that would work fine but, any reason not to take my suggestion of having the C source code for the temporary executable make a call to any libc function? Seems like that would be more technically robust and avoid introducing logic for detecting available linker options. |
In general we would need to call the function with the correct argument types, as compilers know about an increasing number of functions and will complain if they are misused. That might be kind of painful. But if someone can make that work, then, sure, that is fine with me. |
How about assigning the function addresses to a |
Worth a try. |
All you have to do is use one symbol referenced by the dynamic symbol table, adding |
zig cc passes `--gc-sections` to the underlying linker, which then causes undefined symbol errors when compiling with cgo but without C code. Add `-Wl,--no-gc-sections` to make it work with zig cc. Minimal example:

**main.go**

    package main
    import _ "runtime/cgo"
    func main() {}

Run (works after the patch, doesn't work before):

    CC="zig cc" go build main.go

Among the existing code, `src/runtime/testdata/testprognet` fails to build:

    src/runtime/testdata/testprognet$ CC="zig cc" go build .
    net(.text): relocation target __errno_location not defined
    net(.text): relocation target getaddrinfo not defined
    net(.text): relocation target freeaddrinfo not defined
    net(.text): relocation target gai_strerror not defined
    runtime/cgo(.text): relocation target stderr not defined
    runtime/cgo(.text): relocation target fwrite not defined
    runtime/cgo(.text): relocation target vfprintf not defined
    runtime/cgo(.text): relocation target fputc not defined
    runtime/cgo(.text): relocation target abort not defined
    runtime/cgo(.text): relocation target pthread_create not defined
    runtime/cgo(.text): relocation target nanosleep not defined
    runtime/cgo(.text): relocation target pthread_detach not defined
    runtime/cgo(.text): relocation target stderr not defined
    runtime/cgo(.text): relocation target strerror not defined
    runtime/cgo(.text): relocation target fprintf not defined
    runtime/cgo(.text): relocation target abort not defined
    runtime/cgo(.text): relocation target pthread_mutex_lock not defined
    runtime/cgo(.text): relocation target pthread_cond_wait not defined
    runtime/cgo(.text): relocation target pthread_mutex_unlock not defined
    runtime/cgo(.text): relocation target pthread_cond_broadcast not defined
    runtime/cgo(.text): relocation target malloc not defined

With the patch both examples build as expected.

@ianlancetaylor suggested:

> It would be fine with me if somebody wants to send a cgo patch that
> passes -Wl,--no-gc-sections, with a fallback if that option is not
> supported.

... and this is what we are doing.

Tested with zig 0.10.0-dev.2252+a4369918b

Fixes golang#52690
As of CL 334732 `go build` can accept `$CC` with spaces and quotes, which lets us easily use `zig cc` as the C compiler, or easily pass extra compiler parameters:

```
CC="zig cc" go build <...>
CC="clang-13 -v" go build <...>
CC="zig cc -Wl,--print-gc-sections" go build <...>
```

However, the same does not apply for building go itself:

```
$ CC="zig cc" ./make.bash
Building Go cmd/dist using /usr/local/go. (go1.18.2 linux/amd64)
go tool dist: cannot invoke C compiler "zig cc": exec: "zig cc": executable file not found in $PATH

Go needs a system C compiler for use with cgo.
To set a C compiler, set CC=the-compiler.
To disable cgo, set CGO_ENABLED=0.
```

With this change Go can be built directly with `zig cc` (the linker arg will disappear with golang#52815 and/or golang#52690):

```
$ CC="zig cc -Wl,--no-gc-sections" ./make.bash
Building Go cmd/dist using /usr/local/go. (go1.18.2 linux/amd64)
Building Go toolchain1 using /usr/local/go.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/amd64.
---
Installed Go for linux/amd64 in /home/motiejus/code/go
Installed commands in /home/motiejus/code/go/bin
$ ../bin/go version
go version devel go1.19-811f1913a8 Thu May 19 09:44:49 2022 +0300 linux/amd64
```

Fixes golang#52990
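The `exec: "zig cc": executable file not found` error above is what you get when the whole value of `$CC` is treated as a single program name. A minimal sketch of splitting such a value into a program and its arguments might look like the following. The function name `splitCommand` and its quoting rules (single and double quotes group text, no backslash escapes) are assumptions for illustration; the Go toolchain's actual parsing may differ.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// splitCommand splits s on whitespace, treating text inside matching
// single or double quotes as part of one field. Backslash escapes are
// deliberately not handled in this sketch.
func splitCommand(s string) ([]string, error) {
	var fields []string
	var cur strings.Builder
	inField := false
	var quote rune // 0 while outside quotes

	for _, r := range s {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0 // closing quote ends the quoted run
			} else {
				cur.WriteRune(r)
			}
		case r == '\'' || r == '"':
			quote = r
			inField = true // an empty quoted string still yields a field
		case r == ' ' || r == '\t' || r == '\n':
			if inField {
				fields = append(fields, cur.String())
				cur.Reset()
				inField = false
			}
		default:
			cur.WriteRune(r)
			inField = true
		}
	}
	if quote != 0 {
		return nil, fmt.Errorf("unterminated %c quote in %q", quote, s)
	}
	if inField {
		fields = append(fields, cur.String())
	}
	return fields, nil
}

func main() {
	fields, err := splitCommand(os.Getenv("CC"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for i, f := range fields {
		fmt.Printf("argv[%d] = %q\n", i, f)
	}
}
```

With a split like this, the first field is what gets looked up in `$PATH` and the remaining fields are passed along as arguments.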
zig cc passes `--gc-sections` to the underlying linker, which then causes undefined symbol errors when compiling with cgo but without C code. Add `-Wl,--no-gc-sections` to make it work with zig cc. Minimal example:

**main.go**

    package main
    import _ "runtime/cgo"
    func main() {}

Run (works after the patch, doesn't work before):

    CC="zig cc" go build main.go

Among the existing code, `src/runtime/testdata/testprognet` fails to build:

    src/runtime/testdata/testprognet$ CC="zig cc" go build .
    net(.text): relocation target __errno_location not defined
    net(.text): relocation target getaddrinfo not defined
    net(.text): relocation target freeaddrinfo not defined
    net(.text): relocation target gai_strerror not defined
    runtime/cgo(.text): relocation target stderr not defined
    runtime/cgo(.text): relocation target fwrite not defined
    runtime/cgo(.text): relocation target vfprintf not defined
    runtime/cgo(.text): relocation target fputc not defined
    runtime/cgo(.text): relocation target abort not defined
    runtime/cgo(.text): relocation target pthread_create not defined
    runtime/cgo(.text): relocation target nanosleep not defined
    runtime/cgo(.text): relocation target pthread_detach not defined
    runtime/cgo(.text): relocation target stderr not defined
    runtime/cgo(.text): relocation target strerror not defined
    runtime/cgo(.text): relocation target fprintf not defined
    runtime/cgo(.text): relocation target abort not defined
    runtime/cgo(.text): relocation target pthread_mutex_lock not defined
    runtime/cgo(.text): relocation target pthread_cond_wait not defined
    runtime/cgo(.text): relocation target pthread_mutex_unlock not defined
    runtime/cgo(.text): relocation target pthread_cond_broadcast not defined
    runtime/cgo(.text): relocation target malloc not defined

With the patch both examples build as expected.

@ianlancetaylor suggested:

> It would be fine with me if somebody wants to send a cgo patch that
> passes -Wl,--no-gc-sections, with a fallback if that option is not
> supported.

... and this is what we are doing.

Tested with zig 0.10.0-dev.2252+a4369918b

This is a continuation of CL 405414: the original one broke AIX and iOS builds. To fix that, added `unknown option` to the list of strings under lookup.

Fixes golang#52690
Change https://go.dev/cl/407814 mentions this issue: |
I still see this issue with Go 1.20 and zig
@uhthomas Those are different error messages (the details matter). We believe that this issue is fixed. Please open a new issue for the new problem. Thanks. |
Original examples:
What I am observing:
Would you be able to help me understand how this is not the same issue? It looks like it's just different packages and symbols? It may be helpful to know that this does not happen with Zig |
@uhthomas That's exactly right: it's different packages and symbols. In particular it's not about runtime/race any more. I am surprised that this changes based on the version of Zig, though. I have no explanation for that. |
I fixed the memset/memcpy in zig upstream. A fix for res_search and a few more symbols is now developed for upstream zig by my colleague. |
Would you be able to link the memset/memcpy fix here? Was this recent? @motiejus |
Sure, here it is: ziglang/zig@b3f4e0d I attempted to work around the res_search issue in bazel-zig-cc (https://git.sr.ht/~motiejus/bazel-zig-cc/commit/7b0de33070bef14265d7ec560fca43f5e132eea4 and https://git.sr.ht/~motiejus/bazel-zig-cc/commit/8d1e1c9fa66712c8f7da3634990ab4ccd2aa47c9), but that doesn't always work, as you can see. We agreed with @andrewrk that we will change the offending headers in upstream zig (add ifdefs on glibc version). To my latest knowledge, @sywhang is working on that. |
Corresponding Zig issue: ziglang/zig#11398
What version of Go are you using (`go version`)?

Does this issue reproduce with the latest release?
I only tested with go1.18.1.

What operating system and processor architecture are you using (`go env`)?
`go env` Output

What did you do?
go.mod:
foo.go:
foo_test.go:
Then use zig cc as the C toolchain for `go test` with race detection:

Zig Version
0.10.0-dev.2052+3cfde183f

What did you expect to see?
This should work, as it does without `-race`.

What did you see instead?
We get errors from the Go linker:

Notes
Related issue: #44695
It appears that cgo has asked the C compiler to compile C source files which contain dependencies on libc symbols, however the Go linker is not actually putting libc onto the linker line, causing these undefined symbol errors.
It does compile and run successfully using `-linkmode external`, however, I don't see why it shouldn't work with the Go linker as well, given that `zig cc` produces standard ELF object files (same as Clang and GCC).
It looks to me like the fix is simple: the Go linker needs to put libc onto the linker line when it is linking objects that contain libc dependencies.