Fortran+C configure test fails when LTO LDFLAGS are specified #12674
Comments
We had a lengthy conversation about this over in openpmix/openpmix#3350. The result was that we went back and checked the MCA base code, validated that it is doing what we expect it to do, and are convinced that the code is (still) correct. There may be some disagreement on that point from the OP, but we do not have the bandwidth to make the substantial changes that would be required for LTO compilers to grok our tree. We're also not convinced that LTO would noticeably improve Open MPI's performance, for two reasons: 1) the OMPI code base is already optimized down to single-digit microsecond -- and sometimes even just hundreds of nanoseconds -- overheads in critical code paths, and 2) much of Open MPI's core functionality is invoked via function pointers that are determined dynamically at run time, so there's not much that LTO could do anyway. That being said, if Open MPI's code base ever breaks because a compiler determines that our code is wrong in this area, we give the OP the right to say "I told you so!" 😉 The conversation was fruitful, and it was an excellent exercise to go re-validate that (we believe, at least) our code was written and is working as intended. PMIx ended up adding a configure-time check to see if LTO flags were enabled and, if so, abort (since LTO will simply result in a link failure later). We might do the same here in OMPI. Thanks, everyone, for the conversation.
Yup -- my primary concern is actually just that people will have it enabled system-wide and then ompi gets issues. That being said, in this specific case it's actually a configure-time error caused by the configure probes that check for a working Fortran compiler, so getting those probes working shouldn't require a redesign of MCA or the ompi function pointer design. (I think it's just an issue in how the wrappers pass information between Fortran and C. Basically, Fortran is claiming it takes void.) Maybe it's not worth it if you aren't going to make the entire codebase support LTO anyway, but... if you're going to add a configure check and abort early, make sure to add it somewhere early in configure.ac, at least before checking for a Fortran compiler. :D Because the most confusing part of this is actually that it errored out and said "nope, sorry, your Fortran compiler is broken and cannot compile code".
Fair enough. I did actually start to investigate the Fortran compiler configure test failure yesterday and was able to replicate the issue. I ran out of time before figuring out the root cause (I'm not a Fortran expert). I'll keep poking at this, but to be honest: probably with only medium-level priority. Suggestions for how to fix the test would be appreciated, if anyone else knows more about Fortran+C+LTO issues.
I took a stab at this, and after spending a few hours I reached the conclusion that this is a lot of grunt work for very little benefit, other than being able to build with LTO support. The problem is that with LTO enabled the Fortran compiler will not simply accept an EXTERNAL function or subroutine; it needs the exact prototype. In general that would have been relatively easy to handle, until one starts messing around with arrays of CHARACTER, where things get messy, or the LTO checker decides that there is a mismatch between the Fortran and C types. I started to play around with the ISO_C types, but doing this defeats the original purpose of our configure checks. If someone is interested in continuing this, I have attached a sketch of what needs to be done to all Fortran checks. Good luck!
diff --git a/config/ompi_fortran_get_sizeof.m4 b/config/ompi_fortran_get_sizeof.m4
index e25d982c58..b0229f9740 100644
--- a/config/ompi_fortran_get_sizeof.m4
+++ b/config/ompi_fortran_get_sizeof.m4
@@ -32,7 +32,13 @@ AC_DEFUN([OMPI_FORTRAN_GET_SIZEOF],[
cat > conftestf.f90 <<EOF
program fsize
$1
- external size
+ interface
+ subroutine size(x, y)
+ $2 :: x
+ $2 :: y
+ end subroutine size
+ end interface
+
$2 :: x(2)
call size(x(1),x(2))
end program
@@ -52,6 +58,7 @@ $ompi_conftest_h
#ifdef __cplusplus
extern "C" {
#endif
+void $ompi_ac_size_fn(char *a, char *b);
void $ompi_ac_size_fn(char *a, char *b)
{
int diff = (int) (b - a);
diff --git a/config/opal_lang_link_with_c.m4 b/config/opal_lang_link_with_c.m4
index 496081f4b0..8242a1a69e 100644
--- a/config/opal_lang_link_with_c.m4
+++ b/config/opal_lang_link_with_c.m4
@@ -40,8 +40,18 @@ EOF
LIBS="conftest_c.o $LIBS"
m4_if(ompi_lang_link_with_c_fortran, 1,
[AC_LINK_IFELSE([AC_LANG_PROGRAM([], [
- external testfunc
- call testfunc(1)
+ interface
+ function testfunc(n) result(r)
+ use, intrinsic :: iso_c_binding, only: c_int, c_char
+ implicit none
+ integer(c_int) :: r
+ integer(c_int), value :: n
+ end function testfunc
+ end interface
+
+ integer outcome
+ outcome = testfunc(1)
+
])],
[AS_VAR_SET(lang_var, ["yes"])], [AS_VAR_SET(lang_var, ["no"])])],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([
FWIW: I added configure logic to both PMIx and PRRTE to detect that LTO had been requested and error out due to incompatibility. So even if you got this to work, OMPI will still fail to build when it hits either of those packages. I'd suggest following the last advice and just adding the "detect LTO and error out" logic before the Fortran check in OMPI so we don't even attempt this configury.
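For readers wondering what a "detect LTO and error out" check might look like, here is a minimal POSIX shell sketch. It is not the logic that PMIx or PRRTE actually ship. The function names `flags_request_lto` and `check_lto_or_abort`, the set of variables inspected, and the error text are all assumptions made for illustration. It only looks for `-flto` and `-flto=...` words, and it does not account for a later `-fno-lto` cancelling an earlier `-flto`.

```shell
#!/bin/sh
# Hypothetical helper (not from Open MPI, PMIx, or PRRTE): return 0 if any
# word in the given flag strings requests LTO (-flto, -flto=4, -flto=auto).
# Flags such as -fno-lto do not match.
flags_request_lto() {
    # Unquoted $* on purpose: split the flag strings into individual words.
    for flag in $*; do
        case "$flag" in
            -flto|-flto=*) return 0 ;;
        esac
    done
    return 1
}

# Sketch of an early check, meant to run before any Fortran compiler probes.
# A real configure script would abort here; this sketch returns 1 instead so
# it can be exercised on its own.
check_lto_or_abort() {
    if flags_request_lto "${CFLAGS-}" "${CXXFLAGS-}" "${FCFLAGS-}" "${LDFLAGS-}"; then
        echo "configure: error: LTO flags detected; building with LTO is not supported" >&2
        return 1
    fi
    return 0
}
```

Running a check like this before the Fortran tests would replace the misleading "your Fortran compiler is broken" message with one that names the actual cause.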
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
5.0.3 tarball
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Using Gentoo Portage (I am currently working to upgrade Gentoo from openmpi 4.1.6 to 5.0.3).
Details of the problem
I tried to build with the following *FLAGS to optimize the build:
-flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing
Note the -Werror=* flags are used to help detect cases where the compiler tries to optimize by assuming UB cannot exist in the source code -- if it does exist, ordinarily the code would be miscompiled, and this says to make the miscompilation a fatal error.
I wasn't able to successfully finish ./configure.
I got this error:
Here is the relevant snippet from config.log:
Full build log: openmpi-5.0.3:20240711-181726.log
Contents of autotools' config.log: config.log