with-pmix doesn't ensure all is available #1655
Comments
Wish they just used pkg-config. Ah well, I'll look into fixing up our x_ac_pmix.m4 |
Me too, was the first thing I tried. I think this is all fallout from the recent switch to packaging OpenMPI, pmix and slurm so they can be installed together. Hopefully they’ll add one at some point.
On September 12, 2018 at 5:50:52 PM GMT+1, Mark Grondona <notifications@github.com> wrote:
Wish they just used pkg-config. Ah well, I'll look into fixing up our x_ac_pmix.m4
|
Is it ok that the libpmix package installs the headers into a multiarch location?
That seems wrong -- is there some reason pmix.h/pmi.h are architecture specific? |
That sounds fine to me. Minimizing effort here would be good since this only exists in flux to work around IBM's packaging issues. It adds little of value otherwise since the almost identical code is in pmix's compatibility library and we use that if it's there, which I assume it is everywhere but on this spectrum MPI system. |
We at least tried. Oh well. 😞 |
It really isn’t just IBM anymore. We can’t launch a standard OpenMPI on Debian, Ubuntu or RHEL anymore with the stock packages unless we build with pmix. I’m OK with building pmix if that’s best for testing, and I’ll even add it to the image, no problem, but I think it would make life substantially easier if the --with-pmix option could take a prefix path telling it where to look for pmix. It is possible to point configure at a non-standard location today, but that requires passing a link directory, an include directory and an rpath.
On September 12, 2018 at 7:42:47 PM GMT+1, Dong H. Ahn <notifications@github.com> wrote:
It adds little of value otherwise since the almost identical code is in pmix's compatibility library and we use that if it's there, which I assume it is everywhere but on this spectrum MPI system.
We at least tried. Oh well. 😞
|
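The workaround mentioned above (passing a link directory, an include directory and an rpath by hand) might look roughly like the sketch below. The helper name `pmix_configure_flags` is hypothetical, and the `include/` and `lib/` layout under the prefix is an assumption; distributions that install into a multiarch directory would need different paths.

```shell
#!/bin/sh
# Hypothetical helper (not part of flux): derive the extra configure
# arguments from a pmix install prefix.
# Assumption: headers are in $prefix/include and libraries in $prefix/lib.
pmix_configure_flags() {
    prefix=$1
    printf '%s\n' \
        "CPPFLAGS=-I${prefix}/include" \
        "LDFLAGS=-L${prefix}/lib -Wl,-rpath,${prefix}/lib"
}

# Print the flags for a pmix installed under /opt/pmix (example path).
pmix_configure_flags /opt/pmix
```

Each printed line would then be passed to configure as its own quoted argument alongside `--with-pmix`, which is the clutter a prefix-taking option would avoid.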
I'm confused, are you saying that openmpi's mpirun can't launch flux if flux is not built --with-pmix? Since we don't provide pmix (even --with-pmix), does that mean we can no longer launch openmpi programs at all? |
I got myself confused somehow. Without --with-pmix, they can’t launch us, correct. As a side effect of getting --with-pmix going you also end up with libpmi in your library search path, which somehow seems to solve the other issue? Without that flag and the extra search path items I have been getting errors when launching an openmpi program with wreckrun; with them, the errors go away. There is a very good chance my jet-lagged brain jumped to a wrong conclusion about the causal relationship, but there is an issue there.
On September 12, 2018 at 10:46:13 PM GMT+1, Jim Garlick <notifications@github.com> wrote:
I'm confused, are you saying that openmpi's mpirun can't launch flux if flux is not built --with-pmix?
Since we don't provide pmix (even --with-pmix), does that mean we can no longer launch openmpi programs at all?
|
Maybe orterun and/or mpirun add the libpmix path to LD_LIBRARY_PATH for the program being launched (i.e., flux) but not the path to the backward compatibility libpmi. If this is the case, isn't this an issue in their packaging? |
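Do we need a separate issue to track the problem where orte can't launch flux that wasn't built with `--with-pmix`? Looking at the OpenMPI/pmix packaging on Ubuntu, I would not *at all* be surprised if it boiled down to a packaging problem. |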
To match most of the other flux dependencies, I'll add a couple e.g.
|
It’s actually the libpmi1-pmix package we need I think, but yeah the
packaging is odd. There’s a Debian issue on it that references back
to [this pmix issue](open-mpi/ompi#4072) where
they discuss packaging pmix separately from OpenMPI on Debian, Ubuntu
and Fedora (thus eventually RHEL) rather than in-tree.
…On 12 Sep 2018, at 16:05, Mark Grondona wrote:
Do we need a separate issue to track the problem where orte can't
launch flux that wasn't built with `--with-pmix`? Looking at the
OpenMPI/pmix packaging on Ubuntu, I would not *at all* be surprised if
it boiled down to a packaging problem.
|
Ok, I clearly messed up some assumptions yesterday. The with-pmix issue causes problems building flux and subsequently running it with the openmpi launcher, the usual thing, but it amounts to just wanting to be able to pass the information to configure more cleanly. The MPI issue I was seeing was something else. The older openmpi packaged with Ubuntu bionic ships without the flux module for no apparent good reason; on Debian sid the flux module is there but immediately segfaults. I'll open a separate issue for the segfault, but I think the solution here is going to be using mpich in the test environment so we don't have to deal with this pmix foolishness. |
Closing since we nixed pmix client in #1662 |
The current check for pmix checks for the library, but doesn't check for the headers. It also doesn't take a path to allow pmix to be outside of the standard search path, as is the norm for some distributions. This is currently causing some of the test issues in #1652.
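As a rough illustration of the missing behavior, the check could require both the header and the library under a caller-supplied prefix. This is only a shell sketch of the idea, not the actual m4 check: the function name `have_pmix` is hypothetical, and the file names `pmix.h` and `libpmix.so*` and the `include/` and `lib/` layout are assumptions that would not hold for multiarch installs.

```shell
#!/bin/sh
# Hypothetical check (not the flux configure macro): succeed only when
# BOTH the pmix header and a pmix shared library exist under the prefix.
# Assumed file names: include/pmix.h and lib/libpmix.so*.
have_pmix() {
    prefix=$1
    # A library without headers is not enough to build against.
    [ -f "${prefix}/include/pmix.h" ] || return 1
    # Accept any versioned or unversioned shared object name.
    for lib in "${prefix}"/lib/libpmix.so*; do
        [ -e "$lib" ] && return 0
    done
    return 1
}
```

A caller could run `have_pmix /opt/pmix` and fail early with a clear message, rather than finding the library at configure time and hitting a missing header at compile time.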