diff --git a/docs/news/index.rst b/docs/news/index.rst index 9a03873c44d..6e377a4f6fd 100644 --- a/docs/news/index.rst +++ b/docs/news/index.rst @@ -39,6 +39,7 @@ version 1.0. :maxdepth: 1 news-main + news-v5.0.x news-v4.1.x news-v4.0.x news-v3.1.x diff --git a/docs/news/news-main.rst b/docs/news/news-main.rst index c00923fa767..5df955c5284 100644 --- a/docs/news/news-main.rst +++ b/docs/news/news-main.rst @@ -6,9 +6,9 @@ not yet appeared on a release branch. It reflects active development, and is therefore a "loose" listing of features and changes. It is not considered definitive. -Open MPI version 5.0.0rc2 -------------------------- -:Date: 10 Oct 2021 +Open MPI version main +--------------------- +:Date: 30 March 2022 .. admonition:: MPIR API has been removed :class: warning @@ -30,180 +30,3 @@ Open MPI version 5.0.0rc2 This may result in shorter-than-normal startup times and smaller memory footprints. It is recommended to install zlib and zlib-devel for a better user experience. - -- ORTE, the underlying OMPI launcher has been removed, and replaced - with PRTE. -- Reworked how Open MPI integrates with 3rd party packages. - The decision was made to stop building 3rd-party packages - such as Libevent, HWLOC, PMIx, and PRRTE as MCA components - and instead 1) start relying on external libraries whenever - possible and 2) Open MPI builds the 3rd party libraries (if needed) - as independent libraries, rather than linked into libopen-pal. -- Update to use PMIx v4.1.1rc2 -- Update to use PRRTE v2.0.1rc2 -- Change the default component build behavior to prefer building - components as part of libmpi.so instead of individual DSOs. -- Remove pml/yalla, mxm, mtl/psm, and ikrit components. -- Remove all vestiges of the C/R support. -- Various ROMIO v3.4.1 updates. -- Use Pandoc to generate manpages -- 32 bit atomics are now only supported via C11 compliant compilers. 
-- Explicitly disable support for GNU gcc < v4.8.1 (note: the default - gcc compiler that is included in RHEL 7 is v4.8.5). -- Do not build Open SHMEM layer when there are no SPMLs available. - Currently, this means the Open SHMEM layer will only build if - the UCX library is found. -- Fix rank-by algorithms to properly rank by object and span. -- Updated the ``-mca pml`` option to only accept one pml, not a list. -- vprotocol/pessimist: Updated to support ``MPI_THREAD_MULLTIPLE``. -- btl/tcp: Updated to use reachability and graph solving for global - interface matching. This has been shown to improve ``MPI_Init()`` - performance under btl/tcp. -- fs/ime: Fixed compilation errors due to missing header inclusion - Thanks to Sylvain Didelot for finding - and fixing this issue. -- Fixed bug where MPI_Init_thread can give wrong error messages by - delaying error reporting until all infrastructure is running. -- Atomics support removed: S390/s390x, Sparc v9, ARMv4 and ARMv5 CMA - support. -- ``autogen.pl`` now supports a ``-j`` option to run multi-threaded. - Users can also use environment variable ``AUTOMAKE_JOBS``. -- PMI support has been removed for Open MPI apps. -- Legacy btl/sm has been removed, and replaced with btl/vader, which - was renamed to btl/sm. -- Update btl/sm to not use CMA in user namespaces. -- C++ bindings have been removed. -- The ``--am`` and ``--amca`` options have been deprecated. -- opal/mca/threads framework added. Currently supports - argobots, qthreads, and pthreads. See the --with-threads=x option - in configure. -- Various ``README.md`` fixes - thanks to: - Yixin Zhang , - Samuel Cho , - Robert Langfield , - Alex Ross , - Sophia Fang , - mitchelltopaloglu , - Evstrife , and - Hao Tong for their - contributions. -- osc/pt2pt: Removed. Users can use osc/rdma + btl/tcp - for OSC support using TCP, or other providers. -- Open MPI now links -levent_core instead of -levent. -- MPI-4: Added ``ERRORS_ABORT`` infrastructure. 
-- common/cuda docs: Various fixes. Thanks to - Simon Byrne for finding and fixing. -- osc/ucx: Add support for acc_single_intrinsic. -- Fixed ``buildrpm.sh -r`` option used for RPM options specification. - Thanks to John K. McIver III for - reporting and fixing. -- configure: Added support for setting the wrapper C compiler. - Adds new option ``--with-wrapper-cc=``. -- mpi_f08: Fixed Fortran-8-byte-INTEGER vs. C-4-byte-int issue. - Thanks to @ahaichen for reporting the bug. -- MPI-4: Added support for 'initial error handler'. -- opal/thread/tsd: Added thread-specific-data (tsd) api. -- MPI-4: Added error handling for 'unbound' errors to ``MPI_COMM_SELF``. -- Add missing ``MPI_Status`` conversion subroutines: - ``MPI_Status_c2f08()``, ``MPI_Status_f082c()``, ``MPI_Status_f082f()``, - ``MPI_Status_f2f08()`` and the ``PMPI_*`` related subroutines. -- patcher: Removed the Linux component. -- opal/util: Fixed typo in error string. Thanks to - NARIBAYASHI Akira for finding - and fixing the bug. -- fortran/use-mpi-f08: Generate PMPI bindings from the MPI bindings. -- Converted man pages to markdown. - Thanks to Fangcong Yin for their contribution - to this effort. -- Fixed ompi_proc_world error string and some comments in pml/ob1. - Thanks to Julien EMMANUEL for - finding and fixing these issues. -- oshmem/tools/oshmem_info: Fixed Fortran keyword issue when - compiling param.c. Thanks to Pak Lui for - finding and fixing the bug. -- autogen.pl: Patched libtool.m4 for OSX Big Sur. Thanks to - @fxcoudert for reporting the issue. -- Updgraded to HWLOC v2.4.0. -- Removed config/opal_check_pmi.m4. - Thanks to Zach Osman for the contribution. -- opal/atomics: Added load-linked, store-conditional atomics for - AArch6. -- Fixed envvar names to OMPI_MCA_orte_precondition_transports. - Thanks to Marisa Roman - for the contribution. -- fcoll/two_phase: Removed the component. All scenerios it was - used for has been replaced. -- btl/uct: Bumped UCX allowed version to v1.9.x. 
-- ULFM Fault Tolerance has been added. See ``README.FT.ULFM.md``. -- Fixed a crash during CUDA initialization. - Thanks to Yaz Saito for finding - and fixing the bug. -- Added CUDA support to the OFI MTL. -- ompio: Added atomicity support. -- Singleton comm spawn support has been fixed. -- Autoconf v2.7 support has been updated. -- fortran: Added check for ``ISO_FORTRAN_ENV:REAL16``. Thanks to - Jeff Hammond for reporting this issue. -- Changed the MCA component build style default to static. -- PowerPC atomics: Force usage of opal/ppc assembly. -- Removed C++ compiler requirement to build Open MPI. -- Fixed .la files leaking into wrapper compilers. -- Fixed bug where the cache line size was not set soon enough in - ``MPI_Init()``. -- coll/ucc and scoll/ucc components were added. -- coll/ucc: Added support for allgather and reduce collective - operations. -- autogen.pl: Fixed bug where it would not ignore all - excluded components. -- Various datatype bugfixes and performance improvements -- Various pack/unpack bugfixes and performance improvements -- Fix mmap infinite recurse in memory patcher -- Fix C to Fortran error code conversions. -- osc/ucx: Fix data corruption with non-contiguous accumulates -- Update coll/tuned selection rules -- Fix non-blocking collective ops -- btl/portals4: Fix flow control -- Various oshmem:ucx bugfixes and performance improvements -- common/ofi: Disable new monitor API until libfabric 1.14.0 -- Fix AVX detection with icc -- mpirun option ``--mca ompi_display_comm mpi_init``/``mpi_finalize`` - has been added. Enables a communication protocol report: - when ``MPI_Init`` is invoked (using the ``mpi_init`` value) and/or - when ``MPI_Finalize`` is invoked (using the ``mpi_finalize`` value). -- New algorithm for Allgather and Allgatherv added, based on the - paper *"Sparbit: a new logarithmic-cost and data locality-aware MPI - Allgather algorithm"*. 
Default algorithm selection rules are - un-changed, to use these algorithms add: - ``--mca coll_tuned_allgather_algorithm sparbit`` and/or - ``--mca coll_tuned_allgatherv_algorithm sparbit`` - Thanks to: Wilton Jaciel Loch , - and Guilherme Koslovski for their contribution. -- MPI-4: Persistent collectives have been moved to the MPI - namespace from MPIX. -- OFI: Delay patcher initialization until needed. It will now - be initialized only after the component is officially selected. -- MPI-4: Make ``MPI_Comm_get_info``, ``MPI_File_get_info``, and - ``MPI_Win_get_info`` compliant to the standard. -- Portable_platform file has been updated from GASNet. -- GCC versions < 4.8.1 are no longer supported. -- coll: Fix a bug with the libnbc ``MPI_AllReduce`` ring algorithm - when using ``MPI_IN_PLACE``. -- Updated the usage of .gitmodules to use relative paths from - absolute paths. This allows the submodule cloning to use the same - protocol as OMPI cloning. Thanks to Felix Uhl - for the contribution. -- osc/rdma: Add local leader pid in shm file name to make it unique. -- ofi: Fix memory handler unregistration. This change fixes a - segfault during shutdown if the common/ofi component was built - as a dynamic object. -- osc/rdma: Add support for MPI minimum alignment key. -- memory_patcher: Add ability to detect patched memory. Thanks - to Rich Welch for the contribution. -- build: Improve handling of compiler version string. This - fixes a compiler error with clang and armclang. -- Fix bug where the relocation of OMPI packages caused - the launch to fail. -- Various improvements to ``MPI_AlltoAll`` algorithms for both - performance and memory usage. -- coll/basic: Fix segmentation fault in ``MPI_Alltoallw`` with - ``MPI_IN_PLACE``. 
diff --git a/docs/news/news-v5.0.x.rst b/docs/news/news-v5.0.x.rst new file mode 100644 index 00000000000..4b62f150bf9 --- /dev/null +++ b/docs/news/news-v5.0.x.rst @@ -0,0 +1,213 @@ +Open MPI v5.0.x series +====================== + +This file contains all the NEWS updates for the Open MPI v5.0.x +series, in reverse chronological order. + +Open MPI version 5.0.0rc4 +------------------------- +:Date: 30 March 2022 + +.. admonition:: MPIR API has been removed + :class: warning + + As was announced in summer 2017, Open MPI has removed support of + MPIR-based tools beginning with the release of Open MPI v5.0.0. + + The new PRRTE based runtime environment supports PMIx-tools API + instead of the legacy MPIR API for debugging parallel jobs. + + See https://github.com/openpmix/mpir-to-pmix-guide for more + information. + +.. admonition:: zlib is suggested for better user experience + :class: note + + PMIx will optionally use zlib to compress large data streams. + This may result in faster startup times and + smaller memory footprints (compared to not using compression). + The Open MPI community recommends building zlib support with PMIx, + regardless of whether you are using an externally-installed PMIx or + the PMIx that is installed with Open MPI. + +- Updated to use PMIx ``v4.2`` branch - current hash: ``1b86a35``. +- Updated to use PRRTE ``v2.1`` branch - current hash: ``91f791e``. + +.. caution:: + Open MPI no longer builds 3rd-party packages + such as Libevent, HWLOC, PMIx, and PRRTE as MCA components + and instead: + + #. Relies on external libraries whenever possible, and + #. Builds the 3rd party libraries only if needed, and as independent + libraries, rather than linked into the Open MPI core libraries. + +- New Features: + + - ULFM Fault Tolerance support has been added. See :ref:`the ULFM section ` + - ``CUDA`` is now supported in the ``ofi`` MTL. + - mpirun option ``--mca ompi_display_comm mpi_init``/``mpi_finalize`` + has been added. 
This enables a communication protocol report: + when ``MPI_Init`` is invoked (using the ``mpi_init`` value) and/or + when ``MPI_Finalize`` is invoked (using the ``mpi_finalize`` value). + - The threading framework has been added to allow building OMPI with different + threading libraries. It currently supports Argobots, Qthreads, and Pthreads. + See the ``--with-threads`` option in the ``configure`` command. + Thanks to Shintaro Iwasaki and Jan Ciesko for their contributions to + this effort. + - New Thread Local Storage API: Removes global visibility of TLS structures + and allows for dynamic TLS handling. + - Added load-linked, store-conditional atomics support for AArch64. + - Added atomicity support to the ``ompio`` component. + - Added support for the MPI minimum alignment key to the one-sided ``RDMA`` component. + - Added the ability to detect patched memory to ``memory_patcher``. Thanks + to Rich Welch for the contribution. + +- MPI-4.0 updates and additions: + + - Support for ``MPI Sessions`` has been added. + - Added partitioned communication using persistent sends + and persistent receives. + - Added persistent collectives to the ``MPI_`` namespace + (they were previously available via the ``MPIX_`` prefix). + - Added ``MPI_Isendrecv()`` and its variants. + - Added support for ``MPI_Comm_idup_with_info()``. + - Added support for ``MPI_Info_get_string()``. + - Added support for ``initial_error_handler`` and the ``ERRORS_ABORT`` infrastructure. + - Added error handling for "unbound" errors to ``MPI_COMM_SELF``. + - Made ``MPI_Comm_get_info()``, ``MPI_File_get_info()``, and + ``MPI_Win_get_info()`` compliant with the standard. + - Dropped unknown/ignored info keys on communicators, files, + and windows. + +- Transport updates and improvements: + + - One-sided Communication: + + - Many MPI one-sided and RDMA emulation fixes for the ``tcp`` BTL. 
+ + - This patch series fixes many issues when running with + ``--mca osc rdma --mca btl tcp``, i.e., TCP support for one-sided + MPI calls. + - Many MPI one-sided fixes for the ``ucx`` BTL. + - Added support for ``acc_single_intrinsic`` to the one-sided ``ucx`` component. + - Removed the legacy ``pt2pt`` one-sided component. Users should use + the ``rdma`` one-sided component instead with the ``tcp`` BTL and/or other BTLs + to use MPI one-sided calls via TCP transport. + + - Updated the ``tcp`` BTL to use graph solving for global + interface matching between peers in order to improve ``MPI_Init()`` wireup + performance. + + - Shared Memory: + + - The legacy ``sm`` (shared memory) BTL has been removed. + The next-generation shared memory BTL ``vader`` replaces it, + and has been renamed to be ``sm`` (``vader`` will still work as an alias). + - Updated the new ``sm`` BTL to not use Linux Cross Memory Attach (CMA) in user namespaces. + - Fixed a crash when using the new ``sm`` BTL when compiled with Linux Cross Memory Attach (``XPMEM``). + Thanks to George Katevenis for reporting this issue. + + - Updated the ``-mca pml`` option to only accept one pml, not a list. +- Deprecations and removals: + + - ORTE, the underlying OMPI launcher, has been removed and replaced + with the PMIx Reference RunTime Environment (``PRRTE``). + - PMI support has been removed from Open MPI; now only PMIx is supported. + Thanks to Zach Osman for removing config/opal_check_pmi.m4. + - Removed transports PML ``yalla``, ``mxm``, MTL ``psm``, and ``ikrit`` components. + These transports are no longer supported, and are replaced with ``UCX``. + - Removed all vestiges of Checkpoint Restart (C/R) support. + - 32-bit atomics are now only supported via C11-compliant compilers. + - Explicitly disabled support for GNU gcc < v4.8.1 (note: the default + gcc compiler that is included in RHEL 7 is v4.8.5). + - Various atomics support removed: S390/s390x, Sparc v9, ARMv4 and ARMv5 with CMA + support. 
+ - The MPI C++ bindings have been removed. + - The mpirun options ``--am`` and ``--amca`` have been deprecated. + - ompi/contrib: Removed ``libompitrace``. + This library was incomplete and unmaintained. If needed, it + is available in the v4/v4.1 series. +- HWLOC updates: + + - Open MPI now requires HWLOC v1.11.0 or later. + - The internal HWLOC shipped with OMPI has been updated to v2.7.0. + - Enable ``--enable-plugins`` when appropriate. +- Documentation updates and improvements: + + - Open MPI now uses readthedocs.io for all documentation. + - Converted man pages to markdown. Thanks to Fangcong Yin for their contribution + to this effort. + - Various ``README.md`` fixes. Thanks to Yixin Zhang, Samuel Cho, + Robert Langfield, Alex Ross, Sophia Fang, mitchelltopaloglu, Evstrife, + and Hao Tong for their contributions. + - Various CUDA documentation fixes. Thanks to Simon Byrne for finding + and fixing these typos. +- Build updates and fixes: + + - Changed the default component build behavior to prefer building + components as part of the core Open MPI library instead of individual DSOs. + - The Open SHMEM layer is no longer built when there are no SPMLs available. + Currently, this means the Open SHMEM layer will only build if the UCX library is found. + - ``autogen.pl`` now supports a ``-j`` option to run multi-threaded. + Users can also use the environment variable ``AUTOMAKE_JOBS``. + - Updated ``autogen.pl`` to support macOS Big Sur. Thanks to + @fxcoudert for reporting the issue. + - Fixed a bug where ``autogen.pl`` would not ignore all + excluded components when using the ``--exclude`` option. + - Fixed a bug in the ``-r`` option of ``buildrpm.sh`` which would result + in an rpm build failure. Thanks to John K. McIver III for reporting and fixing. + - Removed the ``C++`` compiler requirement to build Open MPI. + - Updates to improve the handling of the compiler version string in the build system. + This fixes a compiler error with clang and armclang. + - Added OpenPMIx binaries to the build, including ``pmix_info``. 
+ Thanks to Mamzi Bayatpour for their contribution to this effort. + - Open MPI now links to Libevent using ``-levent_core`` + and ``-levent_pthread`` instead of ``-levent``. + - Added support for setting the wrapper C compiler. + This adds a new option: ``--with-wrapper-cc=`` to the ``configure`` command. + - Fixed compilation errors in the IME file system component (``fs/ime``) + due to a missing header inclusion. Thanks to Sylvain Didelot for finding + and fixing this issue. + - Added support for GNU Autoconf v2.7.x. +- Other updates and bug fixes: + + - Updated Open MPI to use ``ROMIO`` v3.4.1. + - Fixed Fortran-8-byte-INTEGER vs. C-4-byte-int issue in the ``mpi_f08`` + MPI Fortran bindings module. Thanks to @ahaichen for reporting the bug. + - Added missing ``MPI_Status`` conversion subroutines: + ``MPI_Status_c2f08()``, ``MPI_Status_f082c()``, ``MPI_Status_f082f()``, + ``MPI_Status_f2f08()`` and the ``PMPI_*`` related subroutines. + - Fixed Fortran keyword issue when compiling ``oshmem_info``. + Thanks to Pak Lui for finding and fixing the bug. + - Added check for Fortran ``ISO_FORTRAN_ENV:REAL16``. Thanks to + Jeff Hammond for reporting this issue. + - Fixed Fortran preprocessor issue with CPPFLAGS. + Thanks to Jeff Hammond for reporting this issue. + - MPI module: added the mpi_f08 TYPE(MPI_*) types for Fortran. + Thanks to George Katevenis for the report and their contribution to the patch. + - Fixed a typo in an error string when showing the stackframe. Thanks to + Naribayashi Akira for finding and fixing the bug. + - Fixed output error strings and some comments in the Open MPI code base. + Thanks to Julien Emmanuel for finding and fixing these issues. + - The ``uct`` BTL transport now supports ``UCX`` v1.9 and higher. + There is no longer a maximum supported version. + - Updated the UCT BTL defaults to allow Mellanox HCAs + (``mlx4_0`` and ``mlx5_0``) for compatibility with the one-sided ``rdma`` component. + - Fixed a crash during CUDA initialization. 
+ Thanks to Yaz Saito for finding and fixing the bug. + - Singleton ``MPI_Comm_spawn()`` support has been fixed. + - PowerPC atomics: Force usage of ppc assembly by default. + - Various datatype bugfixes and performance improvements. + - Various pack/unpack bugfixes and performance improvements. + - Various OSHMEM bugfixes and performance improvements. + - A new algorithm for Allgather and Allgatherv has been added, based on the + paper *"Sparbit: a new logarithmic-cost and data locality-aware MPI + Allgather algorithm"*. Default algorithm selection rules are + unchanged; to use these algorithms, add + ``--mca coll_tuned_allgather_algorithm sparbit`` and/or + ``--mca coll_tuned_allgatherv_algorithm sparbit`` to your ``mpirun`` command. + Thanks to Wilton Jaciel Loch and Guilherme Koslovski for their contribution. + - Updated ``.gitmodules`` to use relative paths instead of + absolute paths. This allows the submodule cloning to use the same + protocol as OMPI cloning. Thanks to Felix Uhl for the contribution.