Release 5.0.4 failed to build on GitHub Actions macOS 14.5 arm64 #12693
Comments
The 5.x is missing 7c5ef48 |
Backport #12694 |
Fix merged. Will be included in the next release. But this bug is annoying - recently we had a few snags on MacOS. I wonder if/how we should increase our CI coverage for MacOS (x86 + arm). |
@wenduwan Does this mean a 5.0.5 in the immediate future? |
@jsquyres I'm not sure. We are currently targeting 10/18. There are other critical bugfixes as well, and we can do another release once we get them in. |
My comment was incorrect, I was talking about the 5.x. Let me go and fix it. |
@wenduwan 5.0.4 fails to build on MacOS out of the box. Doesn't that qualify as an "oh crap!" and mandate an immediate 5.0.5? |
@jsquyres I need guidance for this. It's a build failure, meaning packagers (MacOS + ARM) are affected; on the other hand, end users won't be affected, i.e. no runtime error. Do we consider this to warrant an immediate release? |
I also want to hear from @dalcinl: what is the impact on you and your user base? |
Well, building on my Mac/M2, I see the following warnings that I don't grok (may be some missing lines as I only keep the stderr output):
configure: WARNING: -g has been added to CFLAGS (--enable-debug)
configure: WARNING: Could not find pmixcc
configure: WARNING: Your PMIx version is either does not
configure: WARNING: the capabilities feature or does not
configure: WARNING: include the PMIX_CAP_BASE capability flag
configure: WARNING: Ignoring this for now
...
configure: WARNING: UCX version is too old, please upgrade to 1.9 or higher.
There is also a flood of warnings out of OMPI itself, but I'll ignore those for now. However, it [edit: shouldn't have said "built just fine" given all the warnings] "successfully built", so I suspect the failure involves some specific set of aux libraries that activate components, or something else specific to the environment. |
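For anyone comparing configure runs across machines, the warnings quoted above can be pulled out of a saved stderr capture with a few lines of Python. This is only a minimal sketch: the function name `configure_warnings` and the sample text are made up for illustration, and it assumes each warning sits on its own line with the `configure: WARNING:` prefix shown in the output above.

```python
import re

# Matches lines such as "configure: WARNING: Could not find pmixcc".
WARNING_RE = re.compile(r"^configure: WARNING: (.*)$")


def configure_warnings(stderr_text):
    """Return the message part of each configure warning, in order.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    warnings = []
    for line in stderr_text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            warnings.append(match.group(1))
    return warnings


# Sample input assembled from warning lines quoted in this thread.
sample = """configure: WARNING: Could not find pmixcc
checking for gcc... gcc
configure: WARNING: UCX version is too old, please upgrade to 1.9 or higher.
"""

for message in configure_warnings(sample):
    print(message)
```

Diffing the resulting lists from two machines is one way to spot which optional dependencies were detected on one system but not the other.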
As far as I know, UCX does not build on OSX. How does it find one installed on your M2? |
Increase CI coverage to prevent open-mpi#12693 Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
I have no idea! Certainly nothing I would have installed. |
Similar to @dalcinl's report, the Open MPI v5.0.4 tarball fails to compile for me out of the box on my MacOS Sonoma 14.5 M2 Pro:
This is a deal-breaker; v5.0.4 is a busted release. If you have a software package that fails to compile on a major platform, that's a non-starter. It should never have been released. Granted, real HPC jobs are not typically run on laptops, but macOS is a popular development and debugging platform -- so this is important. I do not buy the argument that Open MPI is only installed via packagers; I think we have a lot of users who download and build Open MPI from source. |
Hmmm...that's really weird. I have the exact same machine, same OS version - and it builds for me. I wonder why you are building code elements that don't get built on my machine (or maybe they just successfully build on mine)? No opinion on your conclusion - just curious as to the difference in behavior and what that might portend, especially combined with the strange warnings I saw. |
FWIW: I downloaded the 5.0.4 tarball (not a git clone) and configured with:
|
@wenduwan I managed to update my build scripts to include support for patches. Afterwards, I managed to build Python wheels successfully: https://anaconda.org/mpi4py/openmpi/files?version=5.0.4. This issue will eventually hit conda-forge, https://github.com/conda-forge/openmpi-feedstock, but the fix is trivial, just patching sources. Users are not affected unless they want to build from source; only maintainers and distributors have to deal with the compile issue. |
Historically, the vast majority of OMPI installations have been done from source - and not installed via package. Not saying it couldn't have changed, but that's what we've seen.
That may be the difference - I just checked out v5.0.4 in my git clone. Somewhat odd that this made a difference, though, as the two should be the same - unless the tag is wrong?
Personally, I treat such instances as a busted release, but I also temper it a bit. I rarely do an immediate re-release, but do move up the next release date to be a little sooner. Reason: I don't see any reports of mass suicides or illnesses as a result of having to wait another month or two for a software release on a particular platform. So if it works for the majority, I tend to let it ride for a little while in the expectation that I'm going to see multiple bug reports anyway - as we know, nobody tests these packages until they are released, so we always see a bunch of quick bug reports after release. Just my $0.00002 🤷‍♂️ |
Increase CI coverage to prevent open-mpi#12693 Signed-off-by: Wenduo Wang <wenduwan@amazon.com> (cherry picked from commit fcf7e16)
needed quick turnaround owing to open-mpi/ompi#12693 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
* Open MPI: add release 5.0.4 * OpenMPI: add release 5.0.5 needed quick turnaround owing to open-mpi/ompi#12693 --------- Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
5.0.4
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
https://github.com/mpi4py/mpi-publish/blob/master/cibw-build-mpi.sh
Please describe the system on which you are running
Details of the problem
Full build logs: https://github.com/mpi4py/mpi-publish/actions/runs/9996695568/job/27631626010