SYCL MDRangePolicy parallel_reduce #3801
Conversation
Force-pushed from 11512b4 to 9f02724.
To make independent progress, I dropped the changes to […].
It might be possible to combine the implementation a little better to avoid code duplication, but I would prefer to do that in a follow-up pull request, possibly after implementing the outer […].
Force-pushed from 9f02724 to 16d3572.
Retest this please.

Retest this please.
It might be better to split the kernel but maybe not ...
Code under review:

    const BarePolicy bare_policy = m_policy;
    ...
    cgh.parallel_for(range, [=](sycl::nd_item<1> item) {
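For context, a minimal, self-contained sketch of the pattern these two lines suggest: the policy object is copied by value into the command group, and the multi-dimensional iteration space is flattened onto a 1-D sycl::nd_range and delinearized inside the kernel. The BarePolicy members, work-group size, and per-element work below are illustrative assumptions, not the PR's actual code.

```cpp
#include <sycl/sycl.hpp>

// Hypothetical stand-in for the policy data copied into the kernel; the real
// BarePolicy in Kokkos carries the MDRange bounds and tiling information.
struct BarePolicy {
  int begin0, end0;
  int begin1, end1;
};

int main() {
  sycl::queue q;
  BarePolicy m_policy{0, 4, 0, 8};

  const int n0 = m_policy.end0 - m_policy.begin0;
  const int n1 = m_policy.end1 - m_policy.begin1;
  const int total = n0 * n1;

  int* out = sycl::malloc_shared<int>(total, q);

  q.submit([&](sycl::handler& cgh) {
    // Copy the policy into a local so the device lambda captures it by value.
    const BarePolicy bare_policy = m_policy;

    // Pad the global size to a multiple of the work-group size.
    const int wgroup_size = 32;
    const int n_wgroups = (total + wgroup_size - 1) / wgroup_size;
    sycl::nd_range<1> range(sycl::range<1>(n_wgroups * wgroup_size),
                            sycl::range<1>(wgroup_size));

    cgh.parallel_for(range, [=](sycl::nd_item<1> item) {
      const int flat = static_cast<int>(item.get_global_id(0));
      if (flat >= total) return;  // guard the padded tail
      // Recover the 2-D index from the flat 1-D work-item id.
      const int i0 = bare_policy.begin0 + flat % n0;
      const int i1 = bare_policy.begin1 + flat / n0;
      out[flat] = i0 * 1000 + i1;  // trivial per-element "functor"
    });
  });
  q.wait();
  sycl::free(out, q);
}
```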
We may want to do this in two different parallel_for calls instead of having the if (first_run) {} else {} inside the kernel. Effectively you are paying the full register cost for every subsequent call too.
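To make the trade-off concrete, here is a minimal, self-contained sketch (assumed names and sizes, not the PR's implementation) of the two-submission structure suggested above: the first kernel evaluates the per-element values and reduces them within each work-group, and a second, separate kernel combines the per-group partial results. Because each path is compiled as its own kernel, neither pays the register cost of the other, whereas a single kernel with if (first_run) inside the lambda carries both paths on every launch.

```cpp
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
  // In-order queue so the second kernel waits for the first (assumption for brevity).
  sycl::queue q{sycl::property::queue::in_order{}};

  constexpr int n = 1 << 16;        // total number of iterations (assumed)
  constexpr int wgroup_size = 128;  // work-group size (assumed)
  constexpr int n_wgroups = n / wgroup_size;

  long* partial = sycl::malloc_shared<long>(n_wgroups, q);
  long* result  = sycl::malloc_shared<long>(1, q);

  // First submission: evaluate the per-element values and reduce them within
  // each work-group, writing one partial sum per group.
  q.submit([&](sycl::handler& cgh) {
    sycl::local_accessor<long, 1> scratch(sycl::range<1>(wgroup_size), cgh);
    cgh.parallel_for(
        sycl::nd_range<1>(sycl::range<1>(n), sycl::range<1>(wgroup_size)),
        [=](sycl::nd_item<1> item) {
          const int lid = static_cast<int>(item.get_local_id(0));
          scratch[lid] = static_cast<long>(item.get_global_id(0));  // "functor" value
          // Standard tree reduction in local memory.
          for (int stride = wgroup_size / 2; stride > 0; stride /= 2) {
            sycl::group_barrier(item.get_group());
            if (lid < stride) scratch[lid] += scratch[lid + stride];
          }
          if (lid == 0) partial[item.get_group(0)] = scratch[0];
        });
  });

  // Second submission: a separate kernel combines the per-group partial sums,
  // so it never carries the registers of the first pass.
  q.submit([&](sycl::handler& cgh) {
    cgh.single_task([=] {
      long sum = 0;
      for (int g = 0; g < n_wgroups; ++g) sum += partial[g];
      *result = sum;
    });
  });
  q.wait();

  std::printf("sum = %ld (expected %ld)\n", *result,
              static_cast<long>(n) * (n - 1) / 2);
  sycl::free(partial, q);
  sycl::free(result, q);
}
```

The second pass here is a serial single_task only for brevity; a real implementation would typically repeat a group-level reduction over the partial results until a single value remains.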
Yes, I was also thinking about splitting it, but decided to try to keep it similar to the other SYCL parallel_reduce implementations for now. I prefer optimizing, and maybe fusing, the parallel_reduce implementations some more once they are all merged.
ok
Depends on #3802.