[group] Refine scan_over_group for sub-group #839
Should use `get_max_local_range` to get the maximum sub-group size for the executing kernel.
Signed-off-by: Yilong Guo <yilong.guo@intel.com>
This looks right. It is easy to forget not all the groups have the same size.
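To illustrate why the maximum size and the actual size of a sub-group can differ, here is a minimal host-side sketch in plain C++ (not SYCL device code). The helper `sub_group_sizes` is hypothetical, and it assumes a contiguous partition of the work-group; a real implementation is free to partition differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split a work-group of `work_group_size` items into
// sub-groups of at most `max_sub_group_size` items. The contiguous split is
// an assumption made for illustration; the real partitioning is
// implementation-defined.
std::vector<std::size_t> sub_group_sizes(std::size_t work_group_size,
                                         std::size_t max_sub_group_size) {
  assert(max_sub_group_size > 0);
  std::vector<std::size_t> sizes;
  for (std::size_t begin = 0; begin < work_group_size;
       begin += max_sub_group_size) {
    // The last sub-group holds only the leftover items, so it may be smaller.
    sizes.push_back(std::min(max_sub_group_size, work_group_size - begin));
  }
  return sizes;
}
```

Under that assumption, a work-group of 10 items with a maximum sub-group size of 4 yields sub-groups of 4, 4 and 2 items, so anything sized by the local range of one sub-group may not fit the others.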
Thanks for reviewing. Please allow me to finish some extra testing.
Neither
So using the global linear id directly is probably safer.
Blocking this to avoid regression.
Reorder ref_input according to the actual sub-group partitioning and ordering.
Thank you, @Nuullll ! LGTM!
Any further comments? Can we merge this?
Great!
Use global linear id to index input data. Store both the sub-group id and item linear id within the sub-group to recover sub-group construction when verifying, so that we don't make any assumption on the implementation-defined sub-group partitioning and ordering.
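The verification idea in this commit message could look roughly like the host-side sketch below, written in plain C++ rather than as the actual test code. `ItemRecord` and `verify_inclusive_scan` are hypothetical names, the scan is assumed to be an inclusive plus-scan, and a single work-group is assumed so that sub-group ids are unique.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One record per work-item, stored at the item's global linear id
// (hypothetical layout for this sketch).
struct ItemRecord {
  std::size_t sub_group_id;  // sub-group the item was placed in
  std::size_t local_id;      // item's linear id within that sub-group
  int scanned;               // scan result observed by this item
};

// Check an inclusive plus-scan over sub-groups without assuming how the
// implementation partitioned or ordered the work-items.
bool verify_inclusive_scan(const std::vector<int>& input,
                           const std::vector<ItemRecord>& records) {
  if (input.size() != records.size()) return false;

  // Recover the sub-group construction:
  // (sub_group_id, local_id) -> global linear id.
  std::map<std::pair<std::size_t, std::size_t>, std::size_t> layout;
  for (std::size_t gid = 0; gid < records.size(); ++gid) {
    const ItemRecord& r = records[gid];
    bool inserted =
        layout.emplace(std::make_pair(r.sub_group_id, r.local_id), gid).second;
    if (!inserted) return false;  // two items claim the same slot
  }

  // The map iterates sub-group by sub-group, in local-id order, so a
  // running sum that resets at each new sub-group is the reference scan.
  bool first = true;
  std::size_t current = 0;
  int running = 0;
  for (const auto& entry : layout) {
    if (first || entry.first.first != current) {
      current = entry.first.first;
      running = 0;
      first = false;
    }
    const std::size_t gid = entry.second;
    running += input[gid];
    if (records[gid].scanned != running) return false;
  }
  return true;
}
```

Because the reference values are derived from the recorded ids, the check passes equally for a contiguous partition and for an interleaved one, which is the point of not assuming a particular sub-group layout.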