-
Notifications
You must be signed in to change notification settings - Fork 755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UR][SYCL] Add support for bidirectional USM prefetching #16047
base: sycl
Are you sure you want to change the base?
Conversation
28fb2bd
to
d299e93
Compare
…into ianayl/2way-prefetch
@@ -243,6 +243,19 @@ class node_impl : public std::enable_shared_from_this<node_impl> { | |||
return createCGCopy<sycl::detail::CGFillUSM>(); | |||
case sycl::detail::CGType::PrefetchUSM: | |||
return createCGCopy<sycl::detail::CGPrefetchUSM>(); | |||
#ifdef __INTEL_PREVIEW_BREAKING_CHANGES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why getCGCopy() throws an exception here?
In command.cpp line 3046 there is an implementation for these new command group types when using Graphs. If that implementation is working, then I wouldn't expect an exception to be thrown only when a CG copy happens.
void handler::ext_oneapi_prefetch_exp( | ||
const void *Ptr, size_t Count, | ||
ext::oneapi::experimental::prefetch_type Type) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use throwIfGraphAssociated here and in the other new extension function if it doesn't being used in a graph node.
If this does support being used in a graph node, can we have a E2E test to verify that (even if it will be a no-op based on UR command-buffer implementation). For example, in the same location as https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Graph/RecordReplay/ext_oneapi_enqueue_functions_submit.cpp but looking like the prefetch_exp.cpp
test you've already created.
This PR introduces the ability to prefetch USM memory from both host-to-device and device-to-host via a change to the enqueue functions extension's prefetch function.
The API looks like this after my changes:
This will also act as the corresponding intel/llvm testing PR for the UR PR made to implement the adapter changes required in runtime to facilitate this: oneapi-src/unified-runtime#2312
Regarding the CI failures:
I've gone ahead and introduced a new symbol
ext_oneapi_prefetch_d2h
which triggers a fail in check-sycl that tests for ABI breaks. However, the introduction of this new symbol is not actually an ABI break (as stated in the fail message for the test): Thus, to the best of my knowledge this PR will always have 1 CI failure.RFC:
In my current implementation, I introduced a new CG type whose sole purpose is to indicate a prefetch from device-to-host. This was done to prevent modifications to the original CG node for prefetch, which would cause ABI breaks. I also have an implementation of a CG type for a unified prefetch node that is capable of representing both via a class member in __INTEL_PREVIEW_BREAKING_CHANGES.
However, in hindsight, I am not sure which implementation would be preferred in the long run: having a separate CG type for prefetch from host-to-device vs device-to-host would make e.g. graph analysis marginally easier, as it would be possible to determine prefetch direction without needing to access members of the prefetch node class. However, it does mean prefetch is now represented as 2 CG types. The difference is miniscule, but I'm still curious as to what people prefer, and if I really need the changes in __INTEL_PREVIEW_BREAKING_CHANGES.