sessions: fix a problem with fortran comm handles #14

Open
wants to merge 9 commits into
base: sessions_new

@hppritcha hppritcha commented May 11, 2019

Sessions related changes changed the order of initialization
of pre-defined communicators used in the World Process Model.

This led to issues for MPI Fortran applications since the predefined
handles for MPI_COMM_WORLD, MPI_COMM_SELF, and MPI_COMM_NULL were
wrong, leading to a meltdown with any call to MPI using communicators
from the Fortran interfaces.

Further, when using sessions, there's no guarantee when the app might
finally call MPI_Init, so the original algorithm for adding entries
to the comm struct f_to_c pointer array no longer works as-is.

This commit fixes these issues with Fortran interfaces and MPI
communicator handles.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
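
To illustrate the handle problem described above, here is a minimal sketch of a Fortran-to-C lookup table that reserves fixed slots for the predefined communicators, so their indices do not depend on when (or whether) MPI_Init runs relative to session-created communicators. This is not the code from this PR: the type `comm_t`, the `f_to_c_*` helper names, the table size, and the index values 0, 1, and 2 are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Fortran handle values for the predefined communicators. */
enum {
    F_COMM_WORLD = 0,
    F_COMM_SELF  = 1,
    F_COMM_NULL  = 2,
    F_RESERVED   = 3,   /* slots [0, F_RESERVED) are never handed out dynamically */
    F_TABLE_SIZE = 16   /* arbitrary fixed size for the sketch */
};

/* Stand-in for a communicator object. */
typedef struct comm {
    const char *name;
} comm_t;

/* Fortran index -> C communicator pointer; zero-initialized (all NULL). */
static comm_t *f_to_c[F_TABLE_SIZE];

/* Install a predefined communicator at its fixed Fortran index. */
static void f_to_c_set_predefined(int findex, comm_t *c)
{
    assert(findex >= 0 && findex < F_RESERVED);
    f_to_c[findex] = c;
}

/* Register a dynamically created communicator in the lowest free
 * non-reserved slot. Returns the Fortran index, or -1 if the table is full. */
static int f_to_c_add(comm_t *c)
{
    for (int i = F_RESERVED; i < F_TABLE_SIZE; ++i) {
        if (f_to_c[i] == NULL) {
            f_to_c[i] = c;
            return i;
        }
    }
    return -1;
}

/* Translate a Fortran handle back to the C object (NULL if out of range or unset). */
static comm_t *f_to_c_lookup(int findex)
{
    return (findex >= 0 && findex < F_TABLE_SIZE) ? f_to_c[findex] : NULL;
}
```

With this layout, a communicator registered through `f_to_c_add` before the predefined ones are installed still lands at index 3 or higher, and a later `f_to_c_set_predefined(F_COMM_WORLD, ...)` fills slot 0 as the Fortran side expects.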

@hppritcha (Author)

@hjelmn this fixes a problem Sam was hitting with the app, since it uses Fortran 😄

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
@hppritcha (Author)

@hjelmn could you merge this in when you have a moment?

hppritcha and others added 7 commits May 25, 2019 10:34
so that the code can compile when configured with

--disable-c11-atomics

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
fix fortran problems with sessions
The way fboxes work has issues for the sessions implementation,
in particular the session finalize approach.

Without this temporary fix, if there is no fully synchronizing call
prior to calling session_finalize, there are cases where a process may be probing its fast
mailboxes for processes that are tearing down these fboxes. That results in segfaults and
SIGBUS errors.

The fast box mechanism will need to be supplemented with some kind of shutdown mechanism
that tells the owner of the fboxes when it's okay to actually tear them down.

In the interest of making progress using the sessions prototype with applications, disable
the fbox mechanism for the prototype and return to a real fix at a later
date.

relates to #3

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
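
The commit message above leaves the eventual shutdown mechanism open. One possible shape, shown purely as a sketch, is a counter of peers that may still poll an owner's fast boxes, with the owner tearing them down only once the counter reaches zero. Everything here is hypothetical: the `fbox_owner_t` type and the `fbox_*` function names are invented for the example, and in a real implementation the counter would have to live in memory shared between the processes involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-owner bookkeeping for fast-box teardown. */
typedef struct fbox_owner {
    atomic_int attached_peers;  /* peers that may still poll this owner's boxes */
} fbox_owner_t;

static void fbox_owner_init(fbox_owner_t *o)
{
    atomic_init(&o->attached_peers, 0);
}

/* A peer announces that it will start polling the owner's fast boxes. */
static void fbox_peer_attach(fbox_owner_t *o)
{
    atomic_fetch_add(&o->attached_peers, 1);
}

/* A peer announces that it has stopped polling and will not touch the boxes again. */
static void fbox_peer_detach(fbox_owner_t *o)
{
    int prev = atomic_fetch_sub(&o->attached_peers, 1);
    assert(prev > 0);  /* detach without a matching attach is a bug */
    (void)prev;        /* silence unused warning when NDEBUG is defined */
}

/* The owner may release the boxes only when no peer can still be polling them. */
static bool fbox_can_teardown(fbox_owner_t *o)
{
    return atomic_load(&o->attached_peers) == 0;
}
```

In this scheme the owner's finalize path would check `fbox_can_teardown` (and keep making progress until it returns true) before unmapping anything, which removes the need for a fully synchronizing call ahead of session finalize.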
temporarily disable fboxes for sessions
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>

2 participants