cargo run causes linking problems (dynamic and static) on i686-pc-windows-gnu #8990

Open
clauswilke opened this issue Dec 17, 2020 · 25 comments
Labels
C-bug Category: bug O-windows OS: Windows S-triage Status: This issue is waiting on initial triage.

Comments

@clauswilke

In the context of generating Rust bindings for R via bindgen (libR-sys project, https://github.com/extendr/libR-sys), we have encountered a strange problem for the target i686-pc-windows-gnu that seems to trace back to something cargo run does when calling an executable it has just built.

In brief, we noticed that when trying to run our bindgen build script on 32-bit Windows, the program would abort with a linking error in libclang. We first thought it was a packaging problem with the mingw package providing libclang, and we reported it as such: msys2/MINGW-packages#7442. However, further investigation has shown that this is unlikely to be the case. Instead, we now suspect there's a problem with cargo itself.

@Ilia-Kosenkov has created a minimal example of a Rust program calling libclang, to mimic what bindgen does:
https://github.com/Ilia-Kosenkov/rust_clang_run. This program compiles and runs just fine on 64-bit Windows, and it aborts with a linking error on 32-bit Windows, regardless of whether libclang is linked statically or dynamically. But here's the kicker: This happens only when we run the program with cargo run. If we build the program with cargo build and then run the executable by calling it directly, everything works fine. You can see this behavior in the GitHub Actions for the minimal example, e.g. here: https://github.com/Ilia-Kosenkov/rust_clang_run/runs/1565696019?check_suite_focus=true

In the minimal example, we can of course work around the problem by not using cargo run. But in the actual application, libclang is linked into the build script, and thus the compiled executable is always run by cargo and the linking error occurs.

Related libR-sys issue with discussion and background: extendr/libR-sys#9

Note: We're using the i686-pc-windows-gnu target because that's a hard requirement set by the R project. We cannot use i686-pc-windows-msvc as a workaround, and we have not tested whether the problem exists on i686-pc-windows-msvc.

@alexcrichton
Member

Thanks for the report! I suspect, though, that this may be an issue for the compiler itself rather than Cargo? Cargo seems unlikely to be too involved in what's going on here, since it's perhaps just calling rustc?

@clauswilke
Author

I don't know what happens under the hood when I enter cargo run, so it may be a compiler issue. I just want to emphasize one more time that the generated executable is fine and runs, as long as it isn't started via cargo run.

@clauswilke
Author

If you could point us to the relevant code in Cargo that starts an executable we could dig a little deeper and see if we can narrow down the culprit further.

@alexcrichton
Member

Ah sorry I missed that!

In that case it does indeed sound like a Cargo issue. Cargo will adjust PATH for native libraries built as part of the build since that's how dynamic libraries are found on Windows. Perhaps those adjustments are causing the wrong dll to get loaded at some point?

I'm not entirely sure if we have a great location to point to since the adjustment for the target process command happens in a few locations, but for this sort of error it seems like the PATH adjustments are probably causing the issues?
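One way to check this would be to print PATH from inside the program and compare the output when it is started via cargo run with the output when the executable is started directly. The following is only a minimal sketch using the standard library; path_entries and added_entries are hypothetical helper names and not part of Cargo.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Split a PATH-style value into directories using the platform's separator.
fn path_entries(value: &OsStr) -> Vec<PathBuf> {
    env::split_paths(value).collect()
}

/// Return the entries of `with_cargo` that are absent from `direct`,
/// keeping their original (search) order.
fn added_entries(direct: &[PathBuf], with_cargo: &[PathBuf]) -> Vec<PathBuf> {
    with_cargo
        .iter()
        .filter(|p| !direct.contains(p))
        .cloned()
        .collect()
}

fn main() {
    // Print PATH in search order. Run this once via `cargo run` and once by
    // calling the executable directly, then compare the two listings.
    let path = env::var_os("PATH").unwrap_or_default();
    for (i, dir) in path_entries(&path).iter().enumerate() {
        println!("{i:3}: {}", dir.display());
    }
}
```

Directories that only appear in the cargo run listing, especially near the top, would be the first candidates for supplying a different DLL.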

@clauswilke
Author

Maybe. I don't know anything about Windows, so I'm flying blind here, but @Ilia-Kosenkov has been very good at drilling down and trying to find what exactly goes wrong.

Here is a WinDbg log from a case where the problem occurs (previously posted here):

ModLoad: 00000000`78ed0000 00000000`7a912000   C:\tools\msys64\mingw32\bin\libclang.dll
ModLoad: 00000000`72870000 00000000`72878000   C:\Windows\SysWOW64\VERSION.dll
ModLoad: 00000000`6eb40000 00000000`6eb62000   C:\Users\[redacted]\.rustup\toolchains\nightly-i686-pc-windows-gnu\bin\libgcc_s_dw2-1.dll
ModLoad: 00000000`64b40000 00000000`64b53000   C:\Users\[redacted]\.rustup\toolchains\nightly-i686-pc-windows-gnu\bin\libwinpthread-1.dll
ModLoad: 00000000`038a0000 00000000`08b47000   C:\tools\msys64\mingw32\bin\libLLVM.dll
ModLoad: 00000000`50480000 00000000`5065a000   C:\tools\msys64\mingw32\bin\libstdc++-6.dll
ModLoad: 00000000`038a0000 00000000`08b47000   C:\tools\msys64\mingw32\bin\libLLVM.dll
ModLoad: 00000000`752c0000 00000000`753b7000   C:\Windows\SysWOW64\ole32.dll
ModLoad: 00000000`76890000 00000000`76b05000   C:\Windows\SysWOW64\combase.dll
ModLoad: 00000000`755e0000 00000000`75601000   C:\Windows\SysWOW64\GDI32.dll
ModLoad: 00000000`74ec0000 00000000`74ed7000   C:\Windows\SysWOW64\win32u.dll
ModLoad: 00000000`757e0000 00000000`7593c000   C:\Windows\SysWOW64\gdi32full.dll
ModLoad: 00000000`754e0000 00000000`7555c000   C:\Windows\SysWOW64\msvcp_win.dll
ModLoad: 00000000`76b10000 00000000`76ca8000   C:\Windows\SysWOW64\USER32.dll
ModLoad: 00000000`75a80000 00000000`75ffb000   C:\Windows\SysWOW64\SHELL32.dll
ModLoad: 00000000`75760000 00000000`7579b000   C:\Windows\SysWOW64\cfgmgr32.dll
ModLoad: 00000000`74dd0000 00000000`74e54000   C:\Windows\SysWOW64\shcore.dll
ModLoad: 00000000`762c0000 00000000`76881000   C:\Windows\SysWOW64\windows.storage.dll
ModLoad: 00000000`74fc0000 00000000`75003000   C:\Windows\SysWOW64\powrprof.dll
ModLoad: 00000000`75940000 00000000`7594d000   C:\Windows\SysWOW64\UMPDC.dll
ModLoad: 00000000`74cc0000 00000000`74d04000   C:\Windows\SysWOW64\shlwapi.dll
ModLoad: 00000000`76250000 00000000`7625f000   C:\Windows\SysWOW64\kernel.appcore.dll
ModLoad: 00000000`757a0000 00000000`757b3000   C:\Windows\SysWOW64\cryptsp.dll
ModLoad: 00000000`71240000 00000000`71250000   C:\tools\msys64\mingw32\bin\libffi-7.dll
ModLoad: 00000000`63080000 00000000`6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll
(4424.c44): Unknown exception - code c0000139 (first chance)
ModLoad: 00000000`73d80000 00000000`73f0f000   C:\Windows\SysWOW64\dbghelp.dll
wow64cpu!CpupSyscallStub+0xc:
00000000`77071cbc c3              ret

Not sure if the paths look correct or not.

@Ilia-Kosenkov Could you regenerate this log for your simple example program, and could you also generate a comparable log for when the linking succeeds? Maybe we can pinpoint a path difference or a difference in the libraries that get loaded.

@mati865
Contributor

mati865 commented Dec 17, 2020

I'm not entirely sure if we have a great location to point to since the adjustment for the target process command happens in a few locations, but for this sort of error it seems like the PATH adjustments are probably causing the issues?

It's quite likely.

Maybe the system libraries get messed up when running 32-bit cargo on a 64-bit OS due to the SysWOW64 thing? I guess nobody has a 32-bit OS to test on these days?

@Ilia-Kosenkov

Ilia-Kosenkov commented Dec 17, 2020

Here is part of the output of WinDbgX.
I first compiled the project using
cargo build --release --target=i686-pc-windows-gnu --features runtime
Then attached a debugger to cargo run ...

ModLoad: 59ce0000 5b722000   C:\tools\msys64\mingw32\bin\libclang.dll
ModLoad: 00000000`72bd0000 00000000`72bd8000   C:\Windows\SysWOW64\VERSION.dll
ModLoad: 00000000`64b40000 00000000`64b53000   C:\Users\[redacted]\.rustup\toolchains\nightly-i686-pc-windows-gnu\bin\libwinpthread-1.dll
ModLoad: 00000000`6eb40000 00000000`6eb62000   C:\Users\[redacted]\.rustup\toolchains\nightly-i686-pc-windows-gnu\bin\libgcc_s_dw2-1.dll
ModLoad: 00000000`54850000 00000000`54a2a000   C:\tools\msys64\mingw32\bin\libstdc++-6.dll
ModLoad: 00000000`54a30000 00000000`59cd7000   C:\tools\msys64\mingw32\bin\libLLVM.dll
ModLoad: 00000000`77330000 00000000`77427000   C:\Windows\SysWOW64\ole32.dll
ModLoad: 00000000`75400000 00000000`75675000   C:\Windows\SysWOW64\combase.dll
ModLoad: 00000000`76650000 00000000`76671000   C:\Windows\SysWOW64\GDI32.dll
ModLoad: 00000000`76ed0000 00000000`76ee7000   C:\Windows\SysWOW64\win32u.dll
ModLoad: 00000000`757d0000 00000000`7592c000   C:\Windows\SysWOW64\gdi32full.dll
ModLoad: 00000000`77430000 00000000`774ac000   C:\Windows\SysWOW64\msvcp_win.dll
ModLoad: 00000000`77110000 00000000`772a8000   C:\Windows\SysWOW64\USER32.dll
ModLoad: 00000000`74e80000 00000000`753fb000   C:\Windows\SysWOW64\SHELL32.dll
ModLoad: 00000000`74cb0000 00000000`74ceb000   C:\Windows\SysWOW64\cfgmgr32.dll
ModLoad: 00000000`74dc0000 00000000`74e44000   C:\Windows\SysWOW64\shcore.dll
ModLoad: 00000000`75b10000 00000000`760d1000   C:\Windows\SysWOW64\windows.storage.dll
ModLoad: 00000000`760e0000 00000000`76123000   C:\Windows\SysWOW64\powrprof.dll
ModLoad: 00000000`75a50000 00000000`75a5d000   C:\Windows\SysWOW64\UMPDC.dll
ModLoad: 00000000`74d70000 00000000`74db4000   C:\Windows\SysWOW64\shlwapi.dll
ModLoad: 00000000`76f60000 00000000`76f6f000   C:\Windows\SysWOW64\kernel.appcore.dll
ModLoad: 00000000`77310000 00000000`77323000   C:\Windows\SysWOW64\cryptsp.dll
ModLoad: 00000000`71240000 00000000`71250000   C:\tools\msys64\mingw32\bin\libffi-7.dll
ModLoad: 00000000`63080000 00000000`6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll
(222c.2ef0): Unknown exception - code c0000139 (first chance)
wow64cpu!CpupSyscallStub+0xc:
00000000`774b1cbc c3              ret

This looks like the problematic fragment, as it starts with loading libclang and ends with the c0000139 error, which is the ENTRY_POINT_NOT_FOUND that we get.
After this error, all libraries are unloaded and execution terminates.
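For reference, the same code can be observed without a debugger by starting the executable as a child process and printing its exit code in hex. This is a rough sketch under the assumption that the loader failure surfaces as the process exit code; ntstatus_name and describe_exit_code are hypothetical helpers, and the table lists only a few well-known NTSTATUS values.

```rust
use std::process::Command;

/// Names for a few well-known NTSTATUS values (not an exhaustive table).
fn ntstatus_name(code: u32) -> Option<&'static str> {
    match code {
        0xC000_0005 => Some("STATUS_ACCESS_VIOLATION"),
        0xC000_0135 => Some("STATUS_DLL_NOT_FOUND"),
        0xC000_0139 => Some("STATUS_ENTRYPOINT_NOT_FOUND"),
        _ => None,
    }
}

/// Format an exit code as hex, with a name when it is a known NTSTATUS.
fn describe_exit_code(code: i32) -> String {
    // Reinterpret the bits: the exit code is a signed 32-bit value in Rust,
    // while NTSTATUS values are usually written as unsigned hex.
    let raw = code as u32;
    match ntstatus_name(raw) {
        Some(name) => format!("0x{raw:08X} ({name})"),
        None => format!("0x{raw:08X}"),
    }
}

fn main() {
    // Hypothetical usage: pass the path of the built executable as the first argument.
    let exe = match std::env::args().nth(1) {
        Some(exe) => exe,
        None => {
            eprintln!("usage: <this program> <path-to-executable>");
            return;
        }
    };
    match Command::new(&exe).status() {
        Ok(status) => match status.code() {
            Some(code) => println!("{exe} exited with {}", describe_exit_code(code)),
            None => println!("{exe} terminated without an exit code"),
        },
        Err(err) => eprintln!("failed to start {exe}: {err}"),
    }
}
```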

For comparison, this is what happens if I just debug the produced executable:

ModLoad: 58410000 59e52000   C:\tools\msys64\mingw32\bin\libclang.dll
ModLoad: 72bd0000 72bd8000   C:\Windows\SysWOW64\VERSION.dll
ModLoad: 583e0000 58406000   C:\tools\msys64\mingw32\bin\libgcc_s_dw2-1.dll
ModLoad: 53110000 53128000   C:\tools\msys64\mingw32\bin\libwinpthread-1.dll
ModLoad: 52f30000 5310a000   C:\tools\msys64\mingw32\bin\libstdc++-6.dll
ModLoad: 53130000 583d7000   C:\tools\msys64\mingw32\bin\libLLVM.dll
ModLoad: 77330000 77427000   C:\Windows\SysWOW64\ole32.dll
ModLoad: 75400000 75675000   C:\Windows\SysWOW64\combase.dll
ModLoad: 76650000 76671000   C:\Windows\SysWOW64\GDI32.dll
ModLoad: 76ed0000 76ee7000   C:\Windows\SysWOW64\win32u.dll
ModLoad: 757d0000 7592c000   C:\Windows\SysWOW64\gdi32full.dll
ModLoad: 77430000 774ac000   C:\Windows\SysWOW64\msvcp_win.dll
ModLoad: 77110000 772a8000   C:\Windows\SysWOW64\USER32.dll
ModLoad: 74e80000 753fb000   C:\Windows\SysWOW64\SHELL32.dll
ModLoad: 74cb0000 74ceb000   C:\Windows\SysWOW64\cfgmgr32.dll
ModLoad: 74dc0000 74e44000   C:\Windows\SysWOW64\shcore.dll
ModLoad: 75b10000 760d1000   C:\Windows\SysWOW64\windows.storage.dll
ModLoad: 760e0000 76123000   C:\Windows\SysWOW64\powrprof.dll
ModLoad: 75a50000 75a5d000   C:\Windows\SysWOW64\UMPDC.dll
ModLoad: 74d70000 74db4000   C:\Windows\SysWOW64\shlwapi.dll
ModLoad: 76f60000 76f6f000   C:\Windows\SysWOW64\kernel.appcore.dll
ModLoad: 77310000 77323000   C:\Windows\SysWOW64\cryptsp.dll
ModLoad: 71240000 71250000   C:\tools\msys64\mingw32\bin\libffi-7.dll
ModLoad: 63080000 6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll
ModLoad: 74e50000 74e75000   C:\Windows\SysWOW64\IMM32.DLL
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=775dda60
eip=775331dc esp=00dffd1c ebp=00dffdf0 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll!NtTerminateProcess+0xc:
775331dc c20800          ret     8

Again, I am not a WinDbg guru, but if needed, I can try digging deeper.
Note that this is executed on a real system used every day, so PATH is contaminated by other apps.

So far the only difference I see is that cargo run picks up two libraries (libgcc_s_dw2-1.dll and libwinpthread-1.dll) from the Rust toolchain, while the directly started executable uses the correct mingw32 libs.
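Comparing such logs by eye is error-prone, so here is a small sketch that does the comparison mechanically. It assumes the ModLoad line layout shown above (two address columns followed by the module path); module_dirs and differing_modules are hypothetical helper names, and the sample lines in main are copied from the two logs in this comment.

```rust
use std::collections::BTreeMap;

/// Drop the first whitespace-delimited token and return the remainder.
fn skip_token(s: &str) -> &str {
    let s = s.trim_start();
    match s.find(char::is_whitespace) {
        Some(i) => &s[i..],
        None => "",
    }
}

/// Map each module's lowercase file name to the directory it was loaded from.
/// Only lines starting with "ModLoad:" are used; the first load of a name wins.
fn module_dirs(log: &str) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    for line in log.lines() {
        let rest = match line.trim().strip_prefix("ModLoad:") {
            Some(rest) => rest,
            None => continue,
        };
        // Skip the start and end address columns; what is left is the path.
        let path = skip_token(skip_token(rest)).trim();
        if path.is_empty() {
            continue;
        }
        let (dir, name) = match path.rfind('\\') {
            Some(i) => (&path[..i], &path[i + 1..]),
            None => ("", path),
        };
        map.entry(name.to_lowercase()).or_insert_with(|| dir.to_string());
    }
    map
}

/// List modules present in both logs that were loaded from different directories.
fn differing_modules(log_a: &str, log_b: &str) -> Vec<(String, String, String)> {
    let (a, b) = (module_dirs(log_a), module_dirs(log_b));
    a.iter()
        .filter_map(|(name, dir_a)| {
            let dir_b = b.get(name)?;
            if dir_a.eq_ignore_ascii_case(dir_b) {
                None
            } else {
                Some((name.clone(), dir_a.clone(), dir_b.clone()))
            }
        })
        .collect()
}

fn main() {
    let via_cargo = r"
ModLoad: 00000000`6eb40000 00000000`6eb62000   C:\Users\[redacted]\.rustup\toolchains\nightly-i686-pc-windows-gnu\bin\libgcc_s_dw2-1.dll
ModLoad: 00000000`63080000 00000000`6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll";
    let direct = r"
ModLoad: 583e0000 58406000   C:\tools\msys64\mingw32\bin\libgcc_s_dw2-1.dll
ModLoad: 63080000 6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll";
    for (name, a, b) in differing_modules(via_cargo, direct) {
        println!("{name}:\n  cargo run: {a}\n  direct:    {b}");
    }
}
```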

@clauswilke
Author

@Ilia-Kosenkov Could you check whether the same difference exists on the 64-bit system? It probably does but doesn't cause problems there. Maybe the two libraries that get pulled in from the toolchain have a conflict with zlib1.dll provided by msys2 on i686.

In general it makes me uncomfortable to mix libraries from different sources (rust toolchain vs msys2), but maybe there have been other bugs in the past that required this sort of behavior?

@Ilia-Kosenkov

@clauswilke,
Actually, I am unable to, as the WinDbg output of 64-bit cargo run is drastically different from what I posted above, and I have yet to learn how to navigate this debugger.

It feels like cargo does something completely different for 64-bit compared to 32-bit.
For instance, I see no libclang among the loaded libraries; maybe it runs the executable out of process somehow?

However, if I debug just the executable, I see similar libraries referenced (notice that I have both mingw32 and mingw64 on my PATH, yet still correct libraries are found):

ModLoad: 00007ff8`d9830000 00007ff8`db093000   C:\tools\msys64\mingw64\bin\libclang.dll
ModLoad: 00007ff9`52880000 00007ff9`5288a000   C:\Windows\SYSTEM32\VERSION.dll
ModLoad: 00000000`6bf00000 00000000`711a7000   C:\tools\msys64\mingw32\bin\libLLVM.dll
ModLoad: 00000000`5c480000 00000000`5c498000   C:\tools\msys64\mingw32\bin\libwinpthread-1.dll
ModLoad: 00000000`571b0000 00000000`5738a000   C:\tools\msys64\mingw32\bin\libstdc++-6.dll
ModLoad: 00007ff9`4b860000 00007ff9`4b876000   C:\tools\msys64\mingw64\bin\libwinpthread-1.dll
ModLoad: 00007ff8`d4920000 00007ff8`d9824000   C:\tools\msys64\mingw64\bin\libLLVM.dll
ModLoad: 00007ff9`051d0000 00007ff9`05381000   C:\tools\msys64\mingw64\bin\libstdc++-6.dll
ModLoad: 00007ff9`59ec0000 00007ff9`5a017000   C:\Windows\System32\ole32.dll
ModLoad: 00007ff9`5a4d0000 00007ff9`5a805000   C:\Windows\System32\combase.dll
ModLoad: 00007ff9`5a050000 00007ff9`5a076000   C:\Windows\System32\GDI32.dll
ModLoad: 00007ff9`58600000 00007ff9`58621000   C:\Windows\System32\win32u.dll
ModLoad: 00007ff9`58820000 00007ff9`589b6000   C:\Windows\System32\gdi32full.dll
ModLoad: 00000000`63080000 00000000`6309e000   C:\tools\msys64\mingw32\bin\zlib1.dll
ModLoad: 00007ff9`58630000 00007ff9`586ce000   C:\Windows\System32\msvcp_win.dll
ModLoad: 00007ff9`5a810000 00007ff9`5a9a4000   C:\Windows\System32\USER32.dll
ModLoad: 00007ff9`594d0000 00007ff9`59bb7000   C:\Windows\System32\SHELL32.dll
ModLoad: 00007ff9`58300000 00007ff9`5834a000   C:\Windows\System32\cfgmgr32.dll
ModLoad: 00007ff9`58d90000 00007ff9`58e39000   C:\Windows\System32\shcore.dll
ModLoad: 00007ff9`57a40000 00007ff9`581c1000   C:\Windows\System32\windows.storage.dll
ModLoad: 00000000`62e80000 00000000`62e9f000   C:\tools\msys64\mingw64\bin\zlib1.dll
ModLoad: 00007ff9`579d0000 00007ff9`57a1a000   C:\Windows\System32\powrprof.dll
ModLoad: 00007ff9`57970000 00007ff9`57980000   C:\Windows\System32\UMPDC.dll
ModLoad: 00007ff9`5a300000 00007ff9`5a352000   C:\Windows\System32\shlwapi.dll
ModLoad: 00007ff9`57a20000 00007ff9`57a31000   C:\Windows\System32\kernel.appcore.dll
ModLoad: 00007ff9`58b50000 00007ff9`58b67000   C:\Windows\System32\cryptsp.dll
ModLoad: 00000000`71240000 00000000`71250000   C:\tools\msys64\mingw32\bin\libffi-7.dll
ModLoad: 00000000`71040000 00000000`71051000   C:\tools\msys64\mingw64\bin\libffi-7.dll
ModLoad: 000001f6`5eba0000 000001f6`5ebbc000   C:\tools\msys64\mingw64\bin\libgcc_s_seh-1.dll
ModLoad: 000001f6`5ebc0000 000001f6`5ebdc000   C:\tools\msys64\mingw64\bin\libgcc_s_seh-1.dll
ModLoad: 00007ff9`35b70000 00007ff9`35b8c000   C:\tools\msys64\mingw64\bin\libgcc_s_seh-1.dll
ModLoad: 00007ff9`5a020000 00007ff9`5a04e000   C:\Windows\System32\IMM32.DLL
ntdll!NtTerminateProcess+0x14:
00007ff9`5ab5cc74 c3              ret

@mati865
Contributor

mati865 commented Dec 18, 2020

Rust ships mingw-w64 GCC libs from version 7.X while MSYS2 is already at 10.2.

Could you try running this command on the CI after installing Rust? (Posted from my mobile, so there may be a typo.)
rustup component remove rust-mingw

@Ilia-Kosenkov

@mati865
Contributor

mati865 commented Dec 18, 2020

@Ilia-Kosenkov can you try replacing the libraries shipped with Rust with the ones provided by MSYS2 (bash syntax):

cp <path_to_msys2>/mingw32/bin/{libwinpthread-1.dll,libgcc_s_dw2-1.dll} `rustc --print sysroot`/bin

@clauswilke
Author

@mati865 Should we be able to avoid this problem by not using rustup at all and instead installing rustc and cargo via MSYS2? I'm also running into other linking problems that seem to be caused by interference between the two toolchains, and it seems to me not having two toolchains at all would be best.

@alexcrichton
Member

Yeah, if it's toolchain differences, then a Rust msys2 package, if one exists, likely solves this, since it'll be guaranteed to use the same version of gcc you have on your system (and won't have the bundled copy like Rust releases do). The only other way around this on Windows that I know of is to use an MSVC toolchain to cross-compile to the MinGW target.

@Ilia-Kosenkov

@mati865,
Sorry for the delay. Your solution worked -- I moved the mingw libs to the Rust toolchain folder and now cargo run ... finishes successfully.
I have tested this so far only on my machine, I will reproduce this in GHA shortly.

@Ilia-Kosenkov

@alexcrichton, @mati865, @clauswilke,
I reproduced all steps in GHA. Both solutions work -- either swapping in the mingw32 libs or using the stable-msvc toolchain to compile targeting i686-pc-windows-gnu.
This is the latest GHA report: the last job uses msvc, the other i686 jobs replace the libraries -- everything is green.

@mati865
Contributor

mati865 commented Dec 19, 2020

@clauswilke The MSYS2 Rust package is not always up to date, and the main reason for it is packaging Rust crates into the MSYS2 repo.

Since rust-lang/rust#76167 there is no conflict between the Rust-shipped and system mingw-w64 toolchains. You can test this by removing the toolchain shipped by Rust (rustup component remove rust-mingw).
The conflict here comes from the runtime libraries.

The only way I know of around this on Windows is to use an MSVC toolchain to cross-compile to the MinGW target.

I think cross-compiling from x86_64-gnu to i686-gnu should also work.

When it comes to fixing:
I'm not sure if it's possible for Cargo to avoid adding <sysroot>/bin to the beginning of PATH when running the binaries, since crates using the rustc-dev component could depend on it.
Statically linking libgcc (to avoid shipping it as a DLL) would very likely break rustc's own backtraces (backtraces in crates compiled with rustc would still work).
Updating the mingw-w64 used to build Rust would work around the issue here; I have some WIP, but it's still a long way from completion due to lack of time.

What Cargo (or maybe bindgen?) could do to fix this issue is copy (or hard-link, if possible?) the runtime dependencies next to the created DLL. That would work since Windows first looks for libraries next to your DLL/executable and then searches PATH.
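To make the idea concrete, here is a rough sketch of such a copy step, written as a standalone helper rather than as anything Cargo or bindgen actually does. place_runtime_libs is a hypothetical name, and the source directory, destination directory, and library names are all supplied by the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Place each named library from `src_dir` into `dest_dir`, trying a hard link
/// first and falling back to a regular copy. Returns the destination paths.
fn place_runtime_libs(src_dir: &Path, dest_dir: &Path, names: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut placed = Vec::new();
    for name in names {
        let from = src_dir.join(name);
        let to = dest_dir.join(name);
        // Remove a stale copy so that hard_link does not fail on an existing file.
        if to.exists() {
            fs::remove_file(&to)?;
        }
        // Hard links fail across volumes, so fall back to copying the bytes.
        if fs::hard_link(&from, &to).is_err() {
            fs::copy(&from, &to)?;
        }
        placed.push(to);
    }
    Ok(placed)
}

fn main() -> io::Result<()> {
    // Hypothetical usage: <this program> <dir-with-mingw-dlls> <dir-with-executable>
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: {} <src_dir> <dest_dir>", args[0]);
        return Ok(());
    }
    // The two runtime libraries that differed between the logs in this thread.
    let names = ["libgcc_s_dw2-1.dll", "libwinpthread-1.dll"];
    for path in place_runtime_libs(Path::new(&args[1]), Path::new(&args[2]), &names)? {
        println!("placed {}", path.display());
    }
    Ok(())
}
```

Unlike overwriting the DLLs in the Rust sysroot, this leaves the toolchain installation untouched, at the cost of one extra copy per output directory.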

@clauswilke
Author

@mati865 If I try to build an R package using Rust with x86_64-gnu as default host I get a linker error:

C:/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lgcc_eh

See e.g. here: https://github.com/clauswilke/extendr/runs/1578651175?check_suite_focus=true#step:8:145

This goes away when I use x86_64-msvc. Note that this build process pulls in yet another toolchain, the Rtools toolchain, which ships its own version of mingw64. I'm not sure how all of this fits together.

@mati865
Contributor

mati865 commented Jan 4, 2021

For some reason unknown to me, it tried to build winapi-x86_64-pc-windows-gnu, but there was no x86_64 mingw-w64 toolchain in the PATH?

@Ilia-Kosenkov

This comment has been minimized.

@mati865
Contributor

mati865 commented Jan 15, 2021

@Ilia-Kosenkov sounds like you don't have an i686 linker in PATH:

$ which i686-w64-mingw32-gcc
/mingw32/bin/i686-w64-mingw32-gcc

@Ilia-Kosenkov

This comment has been minimized.

@mati865
Contributor

mati865 commented Jan 15, 2021

@Ilia-Kosenkov that is right, windows-gnu hosts ship a native linker but don't do that for the targets. It's the user's responsibility to provide a linker when cross-compiling.

I'd suggest moving this discussion somewhere else to avoid going off-topic in this issue.

@mati865
Contributor

mati865 commented Aug 18, 2021

I think there is nothing to do from Cargo's perspective; if there are still issues, they should be reported to the main Rust repository.

@Ilia-Kosenkov

@mati865, while there can still be some issues with cross-compilation (or mingw toolchain) on Windows, we were able to find a way to bypass these restrictions in our project, and we are happy with that.
Feel free to close this issue.

Thank you for the discussion, it helped us a lot.

@epage added the S-triage (Status: This issue is waiting on initial triage.) label Nov 20, 2024
6 participants