Slow MPI_Group_difference #12286

k202077 · 2024-01-29T12:29:54Z

In a setup (using OpenMPI 4.1.3) with >14,000 processes, we noticed an unusually long initialization time. While investigating this, we found that ~60 consecutive calls to MPI_Group_difference involving a group that contained all processes of the run took several minutes. I suspect that the implementation of ompi_group_dense_overlap (used by MPI_Group_difference) is suboptimal for such cases, because it appears to use an algorithm with a time complexity of O(n²).

We were able to replicate similar functionality using a collective MPI_Allreduce, which was many times faster, even though MPI_Group_difference is a purely local operation.

A more sophisticated algorithm (for example, one that uses sorted lists of the processes in each group) should be able to improve the performance significantly.
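To illustrate the idea, here is a minimal sketch of a sort-based difference. It is not Open MPI code: `group_difference_sorted` is a hypothetical helper, and it assumes each process can be represented by a unique `int` identifier, which is a simplification of how groups are stored internally. It sorts a copy of the second group once and then binary-searches it for every member of the first group, so the cost is roughly O((n1 + n2) log n2) instead of O(n1 · n2), while the result keeps the order of the first group as MPI_Group_difference requires.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort/bsearch over process identifiers stored as int. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper (not an Open MPI function): writes to out[] every
 * element of g1[] that does not occur in g2[], keeping the order of g1[].
 * out must have room for n1 entries. Returns the number of entries
 * written, or -1 if the temporary copy of g2 cannot be allocated.
 */
int group_difference_sorted(const int *g1, int n1,
                            const int *g2, int n2, int *out)
{
    int *sorted = NULL;
    int count = 0;

    if (n2 > 0) {
        /* Sort a private copy so the caller's g2 is left untouched. */
        sorted = malloc((size_t)n2 * sizeof *sorted);
        if (sorted == NULL)
            return -1;
        memcpy(sorted, g2, (size_t)n2 * sizeof *sorted);
        qsort(sorted, (size_t)n2, sizeof *sorted, cmp_int);
    }

    for (int i = 0; i < n1; i++) {
        /* O(log n2) membership test instead of a linear scan of g2. */
        if (n2 == 0 ||
            bsearch(&g1[i], sorted, (size_t)n2, sizeof *sorted, cmp_int) == NULL)
            out[count++] = g1[i];
    }

    free(sorted);
    return count;
}
```

For example, with g1 = {5, 3, 9, 1, 7} and g2 = {7, 5, 2}, the helper writes {3, 9, 1} to out and returns 3.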

jsquyres · 2024-01-30T16:25:48Z

This is a request for a performance improvement of MPI_Group_difference(). It is unlikely that we'll take such an improvement back on the v4.1.x series -- that series is (slowly) being retired in favor of the v5.0.x series. I.e., we're still actively taking bug fixes, but not necessarily new features / overhauls of existing algorithms.

jsquyres added the feature request, Target: v4.1.x, Target: main, and Target: v5.0.x labels and removed the Target: v4.1.x label · Jan 29, 2024
