enable Half in mpi #1759

Merged
merged 14 commits into develop on Feb 19, 2025

Conversation

@yhmtsai yhmtsai (Member) commented Dec 30, 2024

This PR enables half precision in distributed environments by adding custom operations.

One-sided operations like accumulate and fetch_and_op do not support custom operations.

Note: newer versions of MPI might support half precision natively (also for one-sided operations) if the administrator builds MPI with a compiler that supports native half precision and enables the corresponding option.

TODO:

  • enable the rest of the distributed functions with half
  • put the custom operation in gko::comm? It does not grow with the number of nodes -> create/free when necessary (see the sketch below for the create/free pattern)
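For illustration, a minimal sketch of the custom-operation approach, assuming gko::half converts from float and supports operator+; the function name half_sum is hypothetical, not Ginkgo's actual API:

#include <mpi.h>

#include <ginkgo/ginkgo.hpp>  // assumption: provides gko::half with
                              // conversion from float and operator+

// User-defined reduction with the signature MPI_Op_create expects. MPI
// hands us raw buffers, so we reinterpret them as gko::half and add
// element-wise in software; MPI_SUM cannot be used, because MPI only sees
// the payload as MPI_UNSIGNED_SHORT.
void half_sum(void* in, void* inout, int* len, MPI_Datatype*)
{
    auto* input = static_cast<gko::half*>(in);
    auto* output = static_cast<gko::half*>(inout);
    for (int i = 0; i < *len; i++) {
        output[i] = output[i] + input[i];
    }
}

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    gko::half local(1.0f);
    gko::half result(0.0f);

    MPI_Op op;
    MPI_Op_create(&half_sum, /* commute = */ 1, &op);
    // Transport the 16-bit payload as MPI_UNSIGNED_SHORT; the custom op
    // supplies the half-precision arithmetic.
    MPI_Allreduce(&local, &result, 1, MPI_UNSIGNED_SHORT, op, MPI_COMM_WORLD);
    MPI_Op_free(&op);

    MPI_Finalize();
}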

@yhmtsai yhmtsai added the 1:ST:WIP This PR is a work in progress. Not ready for review. label Dec 30, 2024
@yhmtsai yhmtsai self-assigned this Dec 30, 2024
@ginkgo-bot ginkgo-bot added the reg:testing, type:solver, type:preconditioner, and mod:all labels Dec 30, 2024
@yhmtsai yhmtsai added the 1:ST:ready-for-review label and removed the 1:ST:WIP label Jan 2, 2025
@yhmtsai yhmtsai requested a review from a team January 2, 2025 08:30
@MarcelKoch MarcelKoch self-requested a review January 7, 2025 08:03
@MarcelKoch MarcelKoch (Member) left a comment
I'm mainly concerned about using device buffers for the custom operations, and maybe moving the operations into a private header.

@MarcelKoch MarcelKoch (Member) left a comment

I would suggest removing the heap allocation for predefined MPI ops; the rest looks good.

} // namespace detail


using op_manager = std::shared_ptr<MPI_Op>;
Member:
Maybe just store the MPI_Op in a struct, so that you don't need to allocate/free anything for predefined MPI_Ops.
You could also keep the unique_ptr for the custom op, and only use the struct for the predefined ones.
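
For illustration, a hypothetical wrapper along these lines; the name op_wrapper and its interface are assumptions, not the final API:

#include <mpi.h>

// Hypothetical wrapper: only ops obtained via MPI_Op_create own a
// resource; predefined ops such as MPI_MIN need no allocation or cleanup.
class op_wrapper {
public:
    // Wrap a predefined op; nothing to free later.
    explicit op_wrapper(MPI_Op op) : op_(op), owns_(false) {}

    // Create a custom op from a user function; freed in the destructor.
    explicit op_wrapper(MPI_User_function* fn) : op_(MPI_OP_NULL), owns_(true)
    {
        MPI_Op_create(fn, 1, &op_);
    }

    op_wrapper(const op_wrapper&) = delete;
    op_wrapper& operator=(const op_wrapper&) = delete;

    ~op_wrapper()
    {
        if (owns_) {
            MPI_Op_free(&op_);
        }
    }

    MPI_Op get() const { return op_; }

private:
    MPI_Op op_;
    bool owns_;
};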

Member Author:
I have changed it to a class. Is this what you had in mind?

Member:
yes, that looks like a good approach.

@pratikvn pratikvn self-requested a review January 17, 2025 09:17
@pratikvn pratikvn (Member) left a comment

Some using ... statements are unused, but otherwise LGTM!

Comment on lines 20 to 68
namespace detail {


template <typename ValueType>
inline void min(void* input, void* output, int* len, MPI_Datatype* datatype)
{
    ValueType* input_ptr = static_cast<ValueType*>(input);
    ValueType* output_ptr = static_cast<ValueType*>(output);
    for (int i = 0; i < *len; i++) {
        if (input_ptr[i] < output_ptr[i]) {
            output_ptr[i] = input_ptr[i];
        }
    }
}


}  // namespace detail


using gko::experimental::mpi::op_manager;

template <typename ValueType,
          std::enable_if_t<std::is_arithmetic_v<ValueType>>* = nullptr>
inline op_manager min()
{
    return op_manager(
        []() {
            MPI_Op* operation = new MPI_Op;
            *operation = MPI_MIN;
            return operation;
        }(),
        [](MPI_Op* op) { delete op; });
}

template <typename ValueType,
          std::enable_if_t<!std::is_arithmetic_v<ValueType>>* = nullptr>
inline op_manager min()
{
    return op_manager(
        []() {
            MPI_Op* operation = new MPI_Op;
            MPI_Op_create(&detail::min<ValueType>, 1, operation);
            return operation;
        }(),
        [](MPI_Op* op) {
            MPI_Op_free(op);
            delete op;
        });
}
Member:

Can this be moved to mpi_op.hpp, or entirely removed? Or is there a reason to have it only here?

Member Author:
I only implement min here because we currently only use it here.
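
For context, a hypothetical call site of this min(), assuming gko::half is transported as MPI_UNSIGNED_SHORT as registered in the next hunk:

// Hypothetical usage: reduce an array of gko::half values in place.
// min<gko::half>() takes the custom-operation branch above, because
// gko::half is not std::is_arithmetic; the op_manager shared_ptr frees
// the created MPI_Op once the last holder releases it.
void allreduce_min_half(gko::half* data, int count, MPI_Comm comm)
{
    auto op = min<gko::half>();
    MPI_Allreduce(MPI_IN_PLACE, data, count, MPI_UNSIGNED_SHORT, *op, comm);
}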

Comment on lines +90 to +97
// OpenMPI 5.0 has support for MPIX_C_FLOAT16, and MPICH v3.4a1 provides MPIX_C_FLOAT16
// Only OpenMPI supports complex half
// TODO: use the native type when MPI is configured with the half feature
GKO_REGISTER_MPI_TYPE(half, MPI_UNSIGNED_SHORT);
GKO_REGISTER_MPI_TYPE(std::complex<half>, MPI_FLOAT);
Member:

We will also need to consider whether other MPI implementations natively support half if we want to rely solely on the native support. Supercomputers use their own variants: Cray MPICH (maybe this is similar to MPICH), Intel MPI, etc.

Member Author:

I have discussed it with @MarcelKoch. I think we will go with the custom implementation now, and later we might check whether to add native support. I tried something in deb12f0, and it already shows that the native support is quite inconsistent between OpenMPI and MPICH.
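
For illustration, a sketch of what opting into a native type could look like, assuming (which is not guaranteed across implementations, and is exactly the inconsistency noted above) that MPIX_C_FLOAT16 is detectable as a preprocessor macro:

// Sketch: prefer a native half datatype when the MPI library advertises
// one, otherwise fall back to a 16-bit placeholder plus custom reduction
// ops. Assumes MPIX_C_FLOAT16 is exposed as a macro, which differs
// between OpenMPI and MPICH in practice.
#ifdef MPIX_C_FLOAT16
GKO_REGISTER_MPI_TYPE(half, MPIX_C_FLOAT16);
#else
GKO_REGISTER_MPI_TYPE(half, MPI_UNSIGNED_SHORT);
#endif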

@MarcelKoch MarcelKoch (Member) left a comment

LGTM, only two small nits left.

@yhmtsai yhmtsai requested a review from MarcelKoch February 13, 2025 14:53
@yhmtsai yhmtsai force-pushed the half_mpi branch 2 times, most recently from 742c5e1 to 35d52f2 on February 18, 2025 12:34
@yhmtsai yhmtsai added the 1:ST:ready-to-merge label and removed the 1:ST:ready-for-review label Feb 18, 2025
@yhmtsai yhmtsai merged commit d7c7a7c into develop Feb 19, 2025
9 of 11 checks passed
@yhmtsai yhmtsai deleted the half_mpi branch February 19, 2025 10:53