-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: enable multicore-sdr by default; allow env var to disable #182
Conversation
Note, this is a bad title as it still requires |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow this makes the Go linker not link to hwloc, using prebuilt binaries I'm getting
go build -ldflags="-X=github.com/filecoin-project/lotus/build.CurrentCommit=+git.3b681c84d" -o lotus ./cmd/lotus
# github.com/filecoin-project/lotus/cmd/lotus
/usr/lib/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
/bin/ld: /home/magik6k/github.com/filecoin-project/go-lotus/extern/filecoin-ffi/libfilcrypto.a(hwloc-1ab47d5704e4f2bf.hwloc.dkhqznxr-cgu.0.rcgu.o): in function `core::ptr::drop_in_place<hwloc::bitmap::Bitmap>':
hwloc.dkhqznxr-cgu.0:(.text._ZN4core3ptr42drop_in_place$LT$hwloc..bitmap..Bitmap$GT$17h50092dae1f42cf1fE+0xb): undefined reference to `hwloc_bitmap_free'
/bin/ld: /home/magik6k/github.com/filecoin-project/go-lotus/extern/filecoin-ffi/libfilcrypto.a(hwloc-1ab47d5704e4f2bf.hwloc.dkhqznxr-cgu.0.rcgu.o): in function `hwloc::Topology::new':
hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology3new17h1b1054795ca46179E+0x11): undefined reference to `hwloc_topology_init'
/bin/ld: hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology3new17h1b1054795ca46179E+0x1b): undefined reference to `hwloc_topology_load'
/bin/ld: hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology3new17h1b1054795ca46179E+0x25): undefined reference to `hwloc_topology_get_support'
/bin/ld: /home/magik6k/github.com/filecoin-project/go-lotus/extern/filecoin-ffi/libfilcrypto.a(hwloc-1ab47d5704e4f2bf.hwloc.dkhqznxr-cgu.0.rcgu.o): in function `hwloc::Topology::with_flags':
hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology10with_flags17h7d088a384b896636E+0xb5): undefined reference to `hwloc_topology_init'
/bin/ld: hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology10with_flags17h7d088a384b896636E+0xc2): undefined reference to `hwloc_topology_set_flags'
/bin/ld: hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology10with_flags17h7d088a384b896636E+0xcc): undefined reference to `hwloc_topology_load'
/bin/ld: hwloc.dkhqznxr-cgu.0:(.text._ZN5hwloc8Topology10with_flags17h7d088a384b896636E+0xd6): undefined reference to `hwloc_topology_get_support'
...
(Same happens when built with FFI_BUILD_FROM_SOURCE=1) |
That could be issue then. As (at least the rust-fil-proofs README says) multicore won't work with hwlock. I'll have a closer look. |
@magik6k it seems to be related to your environment. It passed on CI, it compiles on my local machine, a remote machine and also for some miners. If anyone wants to try to reproduce. Checkout https://github.com/filecoin-project/lotus/ and then:
|
It's also reproducible for @magik6k with:
|
@magik6k found out that there is a difference in linking. Master statically links, this PR dynamically links hwloc. Update: that's actually not true. Lotus v1.10-rc1: $ ldd lotus
linux-vdso.so.1 (0x00007fffc61c4000)
libhwloc.so.15 => /usr/lib/libhwloc.so.15 (0x00007f3b07c57000)
libOpenCL.so.1 => /opt/cuda/lib64/libOpenCL.so.1 (0x00007f3b07a50000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f3b07a36000)
libutil.so.1 => /usr/lib/libutil.so.1 (0x00007f3b07a31000)
librt.so.1 => /usr/lib/librt.so.1 (0x00007f3b07a26000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f3b07a05000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007f3b078be000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f3b078b7000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f3b076ea000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f3b07ccc000)
libudev.so.1 => /usr/lib/libudev.so.1 (0x00007f3b076c2000) Lotus with this PR applied: $ ldd lotus linux-vdso.so.1 (0x00007ffca59a1000)
libhwloc.so.15 => /usr/lib/libhwloc.so.15 (0x00007fdb0199e000)
libOpenCL.so.1 => /opt/cuda/lib64/libOpenCL.so.1 (0x00007fdb01797000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fdb0177d000)
libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fdb01778000)
librt.so.1 => /usr/lib/librt.so.1 (0x00007fdb0176d000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fdb0174c000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007fdb01605000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fdb015fe000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fdb01431000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fdb01a13000)
libudev.so.1 => /usr/lib/libudev.so.1 (0x00007fdb01409000) |
I was able to reproduce once, but it was in a checkout I constantly switched to other branches. I cannot reproduce it on clean checkouts. Hence I'd like to ask for help. I've created this one-liner. Please run it and see if it fails with the error message above.
If it works and |
@magik6k: I can reproduce it. It happens if you build current master first and try this PR:
This means that this PR is good on a fresh checkout. It also means that the |
I looked if there's a difference after doing a I think investigating in why |
I think it is caused by golang/go#24355 |
In essence, if headers or archive change but the I'm testing this hypothesis right now. |
Possibly works around golang/go#24355 Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
Can confirm this works for me now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified that:
- It builds for me
- Multicore SDR works in pond
No description provided.