Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduction functors (cuco::static_reduction_map refactoring 1/N) #187

Closed
wants to merge 12 commits into from

Conversation

sleeepyjack
Copy link
Collaborator

This PR is part 1/N of the refactoring effort for PR #98

New design for reduction functors that can be used by cuco::static_reduction_map.

Implements the following ideas from @jrhemstad (link):

Here's what I was thinking. A person has 3 options for the ReductionOp

  1. Use one of the provided cuco::reduce_* types.

    • No additional work should be required. Partial specialization could/should remove the ReductionOp argument from the constructor
  2. Provide a unsynchronized binary callable T F(T, T) and Identity value

  • This needs to be wrapped by custom_op to apply F in a CAS loop
  • Ideally we could detect this kind of callable and implicitly wrap it in custom_op
  1. Provide a synchronized binary callable T F(atomic_ref<T, Scope>, T) and Identity value
  • User responsible for correct synchronization through atomic_ref

Examples:

// 1.
// no need to provide `reduce_add{}` 
// No need to provide identity value
cuco::static_reduction_map<cuco::reduce_add<int>, int, int> add_map{capacity, empty_key, alloc}; 

// 2. Unsynchronized binary callable must be wrapped in `custom_op`
struct unsync_add{ 
   int identity = 0; // Must provide identity value
   int operator()(int a, int b){ return a + b; }
};

// internally should wrap `unsync_add` in `custom_op`
cuco::static_reduction_map<unsync_add, int, int> custom_unsync_add_map(capacity, empty_key, unsync_add{}, alloc);

// 3.
stuct sync_add{
   int identity = 0; // Must provide identity value
   template <thread_scope Scope>
   int operator()(atomic_ref<int, Scope> a, int b){ return a.fetch_add(b, memory_order_relaxed); }
};

cuco::static_reduction_map<sync_add, int, int> custom_sync_add_map(capacity, empty_key, sync_add{}, alloc);

Includes changes from PR #186

@sleeepyjack sleeepyjack force-pushed the feature/reduction_functors branch from ef15401 to 2a8a50f Compare July 11, 2022 13:43
@PointKernel
Copy link
Member

@sleeepyjack which version of clang-format are you using locally?

@sleeepyjack
Copy link
Collaborator Author

sleeepyjack commented Jul 11, 2022

@sleeepyjack which version of clang-format are you using locally?

@PointKernel

$ clang-format --version
Ubuntu clang-format version 14.0.0-1ubuntu1

but I don't have pre-commit setup locally since I thought our CI does this task automatically.

@sleeepyjack sleeepyjack force-pushed the feature/reduction_functors branch from 16e4ed2 to 905edd8 Compare July 11, 2022 16:38
@PointKernel
Copy link
Member

@sleeepyjack which version of clang-format are you using locally?

@PointKernel

$ clang-format --version
Ubuntu clang-format version 14.0.0-1ubuntu1

but I don't have pre-commit setup locally since I thought our CI does this task automatically.

No worries. CI will indeed fix style issues. I'm just curious about which version introduces the formatting difference.

@PointKernel PointKernel added the type: feature request New feature request label Jul 11, 2022
…urn statement in an `if constexpr else` statement.
@sleeepyjack sleeepyjack force-pushed the feature/reduction_functors branch from 31a85e6 to 9cbe890 Compare July 12, 2022 10:13
@sleeepyjack sleeepyjack marked this pull request as draft July 12, 2022 13:06
@sleeepyjack
Copy link
Collaborator Author

I marked this PR as a draft since I might need to add some more changes while developing the other parts of the static_reduction_map. It's good to go for review though.

@PointKernel PointKernel added the Needs Review Awaiting reviews before merging label Jul 12, 2022
@sleeepyjack
Copy link
Collaborator Author

Thanks @PointKernel for the review! I have incorporated your suggestions into the newest commits.

@PointKernel
Copy link
Member

rerun tests

Comment on lines 138 to 154
static constexpr bool atomic_const_invocable_ =
cuda::std::is_invocable_r_v<value_type,
Func,
cuda::atomic<value_type, cuda::thread_scope_system> const&,
value_type> ||
cuda::std::is_invocable_r_v<value_type,
Func,
cuda::atomic<value_type, cuda::thread_scope_device> const&,
value_type> ||
cuda::std::is_invocable_r_v<value_type,
Func,
cuda::atomic<value_type, cuda::thread_scope_block> const&,
value_type> ||
cuda::std::is_invocable_r_v<value_type,
Func,
cuda::atomic<value_type, cuda::thread_scope_thread> const&,
value_type>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Listing all possible values for cuda::thread_scope is very ugly. Does anyone have a better solution?

@sleeepyjack sleeepyjack force-pushed the feature/reduction_functors branch from 60d0aab to 4435d05 Compare August 1, 2022 10:10
@sleeepyjack
Copy link
Collaborator Author

Closing this for now as the design will change a lot with the new refactoring

@sleeepyjack sleeepyjack closed this Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review Awaiting reviews before merging type: feature request New feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants