Split mesh and mesh topology MPI communicators (#2925)
* Improve test name

* Refactor hex cube

* Work on hex mesh

* Small change

* Remove stray function

* Small fix

* Update mesh creation

* Tests

* Remove comments

* Disable test

* Re-enable tests

* Improvements

* Tidy up

* ifdef fix

* Tidy up

* Write test mesh in serial

* Refactoring

* Formatting fix

* Test update

* Reverse test order

* Simplify

* Simplifications

* Simplify

* Update prism mesh

* Update

* Remove ParMETIS hacks

* Improve docs

* Improve docs

* Re-factor topology computation

* Bug fix

* Improve docs

* Disable test with Kahip

* Refactor

* Doc fix

* Fix

* Debug oneAPI CI

* Run mpi tests only

* Use PTSCOTCH in oneAPI CI

* Re-enable oneAPI tests

* Relax tolerance

* Simplifications

* Revert some changes

* Simplification

* Doc fix

* Small update

* Simplification

* Simplify test

* Tidy up

* Simplifications

* Work on create_geometry

* More simplifications

* Add template deduction helpers

* Simplification

* Small change

* Add assert

* Work on geometry subcomm

* Comment improvement

* Fix typo

* Split MPI comm for ParMETIS since it crashes with empty data (see the sketch after this list).

Splitting could be made an option in the future.

* Small simplifications
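
A brief illustration of the communicator split referred to in the ParMETIS item above: ranks whose local data is empty are excluded from the communicator handed to the partitioner, so ParMETIS never sees empty input. This is a minimal sketch in plain MPI; the function name split_nonempty is illustrative, not DOLFINx API, and the actual splitting logic in this commit may differ.

#include <mpi.h>
#include <cstddef>

// Return a communicator containing only the ranks of comm that hold
// data (local_size > 0); ranks with no data get MPI_COMM_NULL back.
MPI_Comm split_nonempty(MPI_Comm comm, std::size_t local_size)
{
  int rank = 0;
  MPI_Comm_rank(comm, &rank);
  int color = (local_size > 0) ? 0 : MPI_UNDEFINED;
  MPI_Comm subcomm = MPI_COMM_NULL;
  MPI_Comm_split(comm, color, rank, &subcomm);
  return subcomm;
}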
garth-wells authored Jan 3, 2024
1 parent d89545a commit 6dac2f6
Showing 25 changed files with 1,013 additions and 961 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/oneapi.yml
@@ -52,7 +52,7 @@ jobs:
- name: Configure DOLFINx C++
run: |
. /opt/intel/oneapi/setvars.sh
cmake -G Ninja -DCMAKE_BUILD_TYPE=Developer -DDOLFINX_ENABLE_SCOTCH=off -DDOLFINX_ENABLE_KAHIP=on -DDOLFINX_UFCX_PYTHON=off -B build -S cpp/
cmake -G Ninja -DCMAKE_BUILD_TYPE=Developer -DDOLFINX_ENABLE_SCOTCH=on -DDOLFINX_ENABLE_KAHIP=on -DDOLFINX_UFCX_PYTHON=off -B build -S cpp/
- name: Build and install DOLFINx C++ library
run: |
@@ -107,4 +107,4 @@ jobs:
- name: Run DOLFINx Python unit tests (MPI, np=2)
run: |
. /opt/intel/oneapi/setvars.sh
mpiexec -n 2 pytest -vs python/test/unit
mpiexec -n 2 pytest python/test/unit
4 changes: 2 additions & 2 deletions cpp/demo/mixed_topology/main.cpp
@@ -129,8 +129,8 @@ int main(int argc, char* argv[])
std::cout << "]\n";
}

auto geom = mesh::create_geometry(MPI_COMM_WORLD, *topo, elements,
cells_list, x, 2);
mesh::Geometry geom = mesh::create_geometry(MPI_COMM_WORLD, *topo, elements,
cells_list, x, 2);

mesh::Mesh<double> mesh(MPI_COMM_WORLD, topo, geom);
std::cout << "num cells = " << mesh.topology()->index_map(2)->size_local()
2 changes: 1 addition & 1 deletion cpp/dolfinx/common/IndexMap.cpp
@@ -1119,7 +1119,7 @@ graph::AdjacencyList<int> IndexMap::index_to_dest_ranks() const
std::transform(idx_to_rank.begin(), idx_to_rank.end(), data.begin(),
[&dest](auto x) { return dest[x.second]; });

return graph::AdjacencyList<int>(std::move(data), std::move(offsets));
return graph::AdjacencyList(std::move(data), std::move(offsets));
}
//-----------------------------------------------------------------------------
std::vector<std::int32_t> IndexMap::shared_indices() const
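
The change above drops the explicit template argument (AdjacencyList<int> becomes AdjacencyList), relying on class template argument deduction enabled by the "Add template deduction helpers" item in the commit message. The stand-alone sketch below shows how such a deduction guide works; List is a toy stand-in for illustration, not the DOLFINx class.

#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// Toy container mirroring the (data, offsets) construction pattern
template <typename T>
struct List
{
  template <typename U, typename V>
  List(U&& d, V&& o) : data(std::forward<U>(d)), offsets(std::forward<V>(o)) {}
  std::vector<T> data;
  std::vector<std::int32_t> offsets;
};

// Deduction guide: deduce T from the data container's value_type, so
// callers can write List(data, offsets) without a template argument
template <typename U, typename V>
List(U, V) -> List<typename std::decay_t<U>::value_type>;

int main()
{
  std::vector<int> data{1, 2, 3, 4};
  std::vector<std::int32_t> offsets{0, 2, 4};
  List l(std::move(data), std::move(offsets)); // deduces List<int>
  return 0;
}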
98 changes: 58 additions & 40 deletions cpp/dolfinx/common/MPI.h
@@ -1,4 +1,4 @@
// Copyright (C) 2007-2014 Magnus Vikstrøm and Garth N. Wells
// Copyright (C) 2007-2023 Magnus Vikstrøm and Garth N. Wells
//
// This file is part of DOLFINx (https://www.fenicsproject.org)
//
@@ -191,19 +191,19 @@ std::vector<int> compute_graph_edges_nbx(MPI_Comm comm,
/// processes. Data is not duplicated across ranks. The global index of
/// a row is its local row position plus the offset for the calling
/// process. The post office rank for a row is determined by applying
/// MPI::index_owner to the global index, and the row is then sent to
/// the post office rank. The function returns that row data for which
/// the caller is the post office.
/// dolfinx::MPI::index_owner to the global index, and the row is then
/// sent to the post office rank. The function returns that row data for
/// which the caller is the post office.
///
/// @param[in] comm MPI communicator
/// @param[in] x Data to distribute (2D, row-major layout)
/// @param[in] shape The global shape of `x`
/// @param[in] comm MPI communicator.
/// @param[in] x Data to distribute (2D, row-major layout).
/// @param[in] shape The global shape of `x`.
/// @param[in] rank_offset The rank offset such that global index of
/// local row `i` in `x` is `rank_offset + i`. It is usually computed
/// using `MPI_Exscan`.
/// @returns (0) local indices of my post office data and (1) the data
/// (row-major). It **does not** include rows that are in `x`, i.e. rows
/// for which the calling process is the post office
/// for which the calling process is the post office.
template <typename U>
std::pair<std::vector<std::int32_t>,
std::vector<typename std::remove_reference_t<typename U::value_type>>>
@@ -217,21 +217,21 @@ distribute_to_postoffice(MPI_Comm comm, const U& x,
/// This function determines local neighborhoods for communication, and
/// then using MPI neighbourhood collectives to exchange data. It is
/// scalable if the neighborhoods are relatively small, i.e. each
/// process communicated with a modest number of other processes
/// process communicated with a modest number of other processes.
///
/// @param[in] comm The MPI communicator
/// @param[in] comm The MPI communicator.
/// @param[in] indices Global indices of the data (row indices) required
/// by calling process
/// by calling process.
/// @param[in] x Data (2D array, row-major) on calling process which may
/// be distributed (by row). The global index for the `[0, ..., n)`
/// local rows is assumed to be the local index plus the offset for this
/// rank.
/// @param[in] shape The global shape of `x`
/// @param[in] shape The global shape of `x`.
/// @param[in] rank_offset The rank offset such that global index of
/// local row `i` in `x` is `rank_offset + i`. It is usually computed
/// using `MPI_Exscan`.
/// @return The data for each index in `indices` (row-major storage)
/// @pre `shape1 > 0`
/// using `MPI_Exscan` on `comm1` from MPI::distribute_data.
/// @return The data for each index in `indices` (row-major storage).
/// @pre `shape1 > 0`.
template <typename U>
std::vector<typename std::remove_reference_t<typename U::value_type>>
distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
@@ -246,20 +246,22 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
/// scalable if the neighborhoods are relatively small, i.e. each
/// process communicated with a modest number of other processes.
///
/// @param[in] comm The MPI communicator
/// @param[in] comm0 Communicator to distribute data across.
/// @param[in] indices Global indices of the data (row indices) required
/// by calling process
/// @param[in] x Data (2D array, row-major) on calling process which may
/// be distributed (by row). The global index for the `[0, ..., n)`
/// local rows is assumed to be the local index plus the offset for this
/// rank.
/// by the calling process.
/// @param[in] comm1 Communicator across which x is distributed. Can be
/// `MPI_COMM_NULL` on ranks where `x` is empty.
/// @param[in] x Data (2D array, row-major) on calling process to be
/// distributed (by row). The global index for the `[0, ..., n)` local
/// rows is assumed to be the local index plus the offset for this rank
/// on `comm1`.
/// @param[in] shape1 The number of columns of the data array `x`.
/// @return The data for each index in `indices` (row-major storage)
/// @return The data for each index in `indices` (row-major storage).
/// @pre `shape1 > 0`
template <typename U>
std::vector<typename std::remove_reference_t<typename U::value_type>>
distribute_data(MPI_Comm comm, std::span<const std::int64_t> indices,
const U& x, int shape1);
distribute_data(MPI_Comm comm0, std::span<const std::int64_t> indices,
MPI_Comm comm1, const U& x, int shape1);

template <typename T>
struct dependent_false : std::false_type
@@ -309,6 +311,7 @@ distribute_to_postoffice(MPI_Comm comm, const U& x,
std::array<std::int64_t, 2> shape,
std::int64_t rank_offset)
{
assert(rank_offset >= 0 or x.empty());
using T = typename std::remove_reference_t<typename U::value_type>;

const int size = dolfinx::MPI::size(comm);
@@ -459,6 +462,7 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
const U& x, std::array<std::int64_t, 2> shape,
std::int64_t rank_offset)
{
assert(rank_offset >= 0 or x.empty());
using T = typename std::remove_reference_t<typename U::value_type>;

common::Timer timer("Distribute row-wise data (scalable)");
@@ -473,19 +477,19 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,

// Send receive x data to post office (only for rows that need to be
// communicated)
auto [post_indices, post_x] = MPI::distribute_to_postoffice(
auto [post_indices, post_x] = dolfinx::MPI::distribute_to_postoffice(
comm, x, {shape[0], shape[1]}, rank_offset);
assert(post_indices.size() == post_x.size() / shape[1]);

// 1. Send request to post office ranks for data

// Build list of (src, global index, global, index positions) for each
// Build list of (src, global index, global, index position) for each
// entry in 'indices' that doesn't belong to this rank, then sort
std::vector<std::tuple<int, std::int64_t, std::int32_t>> src_to_index;
for (std::size_t i = 0; i < indices.size(); ++i)
{
std::size_t idx = indices[i];
if (int src = MPI::index_owner(size, idx, shape[0]); src != rank)
if (int src = dolfinx::MPI::index_owner(size, idx, shape[0]); src != rank)
src_to_index.push_back({src, idx, i});
}
std::sort(src_to_index.begin(), src_to_index.end());
@@ -559,14 +563,14 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
err = MPI_Comm_free(&neigh_comm0);
dolfinx::MPI::check_error(comm, err);

// 2. Send data (rows of x) back to requesting ranks (transpose of the
// preceding communication pattern operation)
// 2. Send data (rows of x) from post office back to requesting ranks
// (transpose of the preceding communication pattern operation)

// Build map from local index to post_indices position. Set to -1 for
// data that was already on this rank and was therefore was not
// sent/received via a postoffice.
const std::array<std::int64_t, 2> postoffice_range
= MPI::local_range(rank, shape[0], size);
= dolfinx::MPI::local_range(rank, shape[0], size);
std::vector<std::int32_t> post_indices_map(
postoffice_range[1] - postoffice_range[0], -1);
for (std::size_t i = 0; i < post_indices.size(); ++i)
@@ -643,7 +647,8 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
}
else
{
if (int src = MPI::index_owner(size, index, shape[0]); src == rank)
if (int src = dolfinx::MPI::index_owner(size, index, shape[0]);
src == rank)
{
// In my post office bag
auto local_index = index - postoffice_range[0];
@@ -668,21 +673,34 @@ distribute_from_postoffice(MPI_Comm comm, std::span<const std::int64_t> indices,
//---------------------------------------------------------------------------
template <typename U>
std::vector<typename std::remove_reference_t<typename U::value_type>>
distribute_data(MPI_Comm comm, std::span<const std::int64_t> indices,
const U& x, int shape1)
distribute_data(MPI_Comm comm0, std::span<const std::int64_t> indices,
MPI_Comm comm1, const U& x, int shape1)
{
assert(shape1 > 0);
assert(x.size() % shape1 == 0);
const std::int64_t shape0_local = x.size() / shape1;

std::int64_t shape0(0), rank_offset(0);
int err
= MPI_Allreduce(&shape0_local, &shape0, 1, MPI_INT64_T, MPI_SUM, comm);
dolfinx::MPI::check_error(comm, err);
err = MPI_Exscan(&shape0_local, &rank_offset, 1, MPI_INT64_T, MPI_SUM, comm);
dolfinx::MPI::check_error(comm, err);
int err;
std::int64_t shape0 = 0;
err = MPI_Allreduce(&shape0_local, &shape0, 1, MPI_INT64_T, MPI_SUM, comm0);
dolfinx::MPI::check_error(comm0, err);

std::int64_t rank_offset = -1;
if (comm1 != MPI_COMM_NULL)
{
rank_offset = 0;
err = MPI_Exscan(&shape0_local, &rank_offset, 1, MPI_INT64_T, MPI_SUM,
comm1);
dolfinx::MPI::check_error(comm1, err);
}
else
{
rank_offset = -1;
if (!x.empty())
throw std::runtime_error("Non-empty data on null MPI communicator");
}

return distribute_from_postoffice(comm, indices, x, {shape0, shape1},
return distribute_from_postoffice(comm0, indices, x, {shape0, shape1},
rank_offset);
}
//---------------------------------------------------------------------------
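
The reworked distribute_data above takes two communicators: comm0, over which requested rows are returned, and comm1, over which the input rows of x are distributed, with MPI_COMM_NULL passed on ranks that hold no rows. The following is a minimal usage sketch under that assumption; the data values and requested indices are invented for illustration.

#include <dolfinx/common/MPI.h>
#include <mpi.h>
#include <cstdint>
#include <vector>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  MPI_Comm comm0 = MPI_COMM_WORLD;
  int rank = 0;
  MPI_Comm_rank(comm0, &rank);

  // Row-major data with shape1 = 3 columns, held only on rank 0
  std::vector<double> x;
  if (rank == 0)
    x = {0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1.};

  // Ranks holding rows form comm1; empty ranks pass MPI_COMM_NULL
  MPI_Comm comm1 = MPI_COMM_NULL;
  MPI_Comm_split(comm0, x.empty() ? MPI_UNDEFINED : 0, rank, &comm1);

  // Every rank on comm0 requests the global rows it needs
  std::vector<std::int64_t> indices = {0, 3};
  std::vector<double> rows
      = dolfinx::MPI::distribute_data(comm0, indices, comm1, x, 3);

  if (comm1 != MPI_COMM_NULL)
    MPI_Comm_free(&comm1);
  MPI_Finalize();
  return 0;
}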
4 changes: 2 additions & 2 deletions cpp/dolfinx/graph/ordering.cpp
@@ -116,8 +116,8 @@ create_level_structure(const graph::AdjacencyList<int>& graph, int s)
++l;
}

return graph::AdjacencyList<int>(std::move(level_structure),
std::move(level_offsets));
return graph::AdjacencyList(std::move(level_structure),
std::move(level_offsets));
}

//-----------------------------------------------------------------------------
3 changes: 1 addition & 2 deletions cpp/dolfinx/graph/partition.cpp
@@ -386,8 +386,7 @@ graph::build::compute_local_to_global(std::span<const std::int64_t> global,
if (global.size() != local.size())
throw std::runtime_error("Data size mismatch.");

const std::int32_t max_local_idx
= *std::max_element(local.begin(), local.end());
std::int32_t max_local_idx = *std::max_element(local.begin(), local.end());
std::vector<std::int64_t> local_to_global_list(max_local_idx + 1, -1);
for (std::size_t i = 0; i < local.size(); ++i)
{
9 changes: 3 additions & 6 deletions cpp/dolfinx/graph/partition.h
@@ -22,16 +22,13 @@ namespace dolfinx::graph

/// @brief Signature of functions for computing the parallel
/// partitioning of a distributed graph.
// See https://github.com/doxygen/doxygen/issues/9552
/// @cond
/// @param[in] comm MPI Communicator that the graph is distributed
/// across
/// @param[in] nparts Number of partitions to divide graph nodes into
/// @param[in] local_graph Node connectivity graph
/// @param[in] ghosting Flag to enable ghosting of the output node
/// distribution
/// @return Destination rank for each input node
/// @endcond
using partition_fn = std::function<graph::AdjacencyList<std::int32_t>(
MPI_Comm, int, const AdjacencyList<std::int64_t>&, bool)>;

@@ -107,11 +104,11 @@ compute_ghost_indices(MPI_Comm comm,
/// starting from zero, compute a local-to-global map for the links.
/// Both adjacency lists must have the same shape.
///
/// @param[in] global Adjacency list with global link indices
/// @param[in] local Adjacency list with local, contiguous link indices
/// @param[in] global Adjacency list with global link indices.
/// @param[in] local Adjacency list with local, contiguous link indices.
/// @return Map from local index to global index, which if applied to
/// the local adjacency list indices would yield the global adjacency
/// list
/// list.
std::vector<std::int64_t>
compute_local_to_global(std::span<const std::int64_t> global,
std::span<const std::int32_t> local);
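
The partition_fn alias shown above is the hook for supplying a custom graph partitioner. The sketch below builds a deliberately trivial round-robin partitioner matching that signature; round_robin is an illustrative name and not part of DOLFINx.

#include <dolfinx/graph/AdjacencyList.h>
#include <dolfinx/graph/partition.h>
#include <mpi.h>
#include <cstdint>
#include <utility>
#include <vector>

// Build a partition_fn that assigns local node i to rank i % nparts.
dolfinx::graph::partition_fn round_robin()
{
  return [](MPI_Comm comm, int nparts,
            const dolfinx::graph::AdjacencyList<std::int64_t>& local_graph,
            bool ghosting) -> dolfinx::graph::AdjacencyList<std::int32_t>
  {
    (void)comm;     // Unused: no parallel coordination needed here
    (void)ghosting; // Unused: exactly one destination per node
    const std::int32_t n = local_graph.num_nodes();
    std::vector<std::int32_t> dest(n);
    std::vector<std::int32_t> offsets(n + 1, 0);
    for (std::int32_t i = 0; i < n; ++i)
    {
      dest[i] = i % nparts;
      offsets[i + 1] = i + 1;
    }
    return dolfinx::graph::AdjacencyList<std::int32_t>(std::move(dest),
                                                       std::move(offsets));
  };
}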