Feature/async kernel submition #1219

Merged: 2 commits, Nov 30, 2023

Contributor

@ZzEeKkAa ZzEeKkAa commented Nov 15, 2023

Add a new call_kernel_async function to the experimental package to submit a kernel without waiting for it to finish executing.
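For readers unfamiliar with the distinction, here is a rough analogy in plain Python for blocking versus non-blocking submission. It is only a sketch: it uses the standard-library thread pool instead of a SYCL queue, and fake_kernel, run_blocking and run_async are hypothetical names that do not exist in numba-dpex.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_kernel(data):
    """Stand-in for device work (hypothetical, not a numba-dpex kernel)."""
    time.sleep(0.05)
    return [x * 2 for x in data]


def run_blocking(data):
    # Submit and wait immediately: the caller stalls until the work is done.
    with ThreadPoolExecutor(max_workers=1) as queue:
        return queue.submit(fake_kernel, data).result()


def run_async(data):
    # Submit and get a handle back right away; wait only when the result is needed.
    with ThreadPoolExecutor(max_workers=1) as queue:
        event = queue.submit(fake_kernel, data)  # returns without waiting
        host_side_work = sum(range(10))          # can overlap with the "kernel"
        return event.result(), host_side_work    # the explicit wait happens here


print(run_blocking([1, 2, 3]))
print(run_async([4, 5, 6]))
```

The handle returned by submit plays the role of the event here: the caller decides when to wait on it.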

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • If this PR is a work in progress, are you filing the PR as a draft?

@ZzEeKkAa ZzEeKkAa self-assigned this Nov 15, 2023
@ZzEeKkAa ZzEeKkAa changed the base branch from main to feature/overload_syclevent November 15, 2023 22:42
Base automatically changed from feature/overload_syclevent to main November 15, 2023 23:29
@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch 3 times, most recently from c0cfee5 to 5c1a60d Compare November 16, 2023 21:34
@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch 5 times, most recently from 914dcd5 to c7b22d1 Compare November 28, 2023 17:36
@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch 2 times, most recently from 036e301 to 1ebbe2e Compare November 28, 2023 23:08
@ZzEeKkAa ZzEeKkAa changed the title WIP: Feature/async kernel submition Feature/async kernel submition Nov 28, 2023
index_space,
kernel_args,
)
device_event.wait() # pylint: disable=E1101
Contributor

In this case, how will the device_event object get cleaned up? Previously, we had sycl.dpctl_event_delete(self._builder, eref).

Contributor Author

wrap_event_reference allocates a meminfo, so the SYCL event object effectively becomes a smart pointer. The meminfo's destructor calls DPCTLEvent_Delete, which is the same function that dpctl_event_delete calls: https://github.com/IntelPython/numba-dpex/blob/main/numba_dpex/core/runtime/_eventstruct.c#L19
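To illustrate the ownership idea, here is a toy Python model of a reference-counted owner whose destructor runs a delete callback. This is a sketch under assumptions, not the actual runtime code: MemInfo, acquire, release and fake_event_delete are hypothetical stand-ins for the NRT meminfo and DPCTLEvent_Delete.

```python
class MemInfo:
    """Toy reference-counted owner, loosely modelled on an NRT meminfo (hypothetical)."""

    def __init__(self, payload, dtor):
        self.payload = payload
        self._dtor = dtor
        self.refcount = 1  # the creator holds the first reference

    def acquire(self):
        self.refcount += 1

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            # Last reference dropped: run the destructor exactly once.
            self._dtor(self.payload)


deleted = []


def fake_event_delete(event_ref):
    """Stand-in for the real event delete call; it only records the call."""
    deleted.append(event_ref)


mi = MemInfo("event-ref-0", fake_event_delete)
mi.acquire()   # e.g. a second holder keeps the event alive
mi.release()   # first holder is done; the event must survive
still_alive = list(deleted)
mi.release()   # last holder is done; the destructor deletes the event
```

After the first release nothing has been deleted; only the final release triggers the delete callback.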

Contributor Author

Here is the log of the async call:

DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x561537ebfc00 external_allocator=0x561537e91fb0 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x561537b7c1c0 external_allocator=0x561537df11b0 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x561537b49c50 external_allocator=0x561537fb8120 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: creating new dpctl event meminfo.
DPEXRT-DEBUG: scheduling nrt meminfo release.
DPEXRT-DEBUG: acquired meminfo.
DPEXRT-DEBUG: creating new dpctl event meminfo.
DPEXRT-DEBUG: creating new event object.
DPEXRT-DEBUG: released meminfo from host_task.
DPEXRT-DEBUG: deleting dpctl event reference.
DPEXRT-DEBUG: creating new event object.
DPEXRT-DEBUG: deleting dpctl event reference.
DPEXRT-DEBUG: released meminfo from host_task.
DPEXRT-DEBUG: released meminfo from host_task.

Both dpctl event references are deleted.

And for call_kernel:

DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x55f94f0452d0 external_allocator=0x55f94f4a6260 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x55f94f3f3470 external_allocator=0x55f94f403860 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: In DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 805
DPEXRT-DEBUG: usm type = 1 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 282.
DPEXRT-DEBUG: NRT_MemInfo_init mi=0x55f94f044c40 external_allocator=0x55f94f0e5b30 at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 476
DPEXRT-DEBUG: Done with unboxing call to DPEXRT_sycl_usm_ndarray_from_python at /home/yevhenii/Projects/numba-dpex/numba_dpex/core/runtime/_dpexrt_python.c, line 885
DPEXRT-DEBUG: creating new dpctl event meminfo.
DPEXRT-DEBUG: deleting dpctl event reference.

As you can see, the event reference is deleted here as well.

Contributor

OK, but we should not add the overhead of an NRT_MemInfo allocation when we know it is not going to be used. We should just keep the direct call to sycl.dpctl_event_delete for the call_kernel scenario.
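A small Python sketch of the two paths being contrasted: the blocking path waits and deletes the event directly, while the async path pays for an owner object because the event outlives the call. All names here (FakeEvent, delete_event, OwnedEvent, submit_and_wait, submit_async) are hypothetical and only model the control flow, not the generated LLVM IR or the real runtime.

```python
class FakeEvent:
    """Stand-in for a SYCL event reference (hypothetical)."""

    def __init__(self, log):
        self.log = log

    def wait(self):
        self.log.append("wait")


def delete_event(event):
    """Stand-in for a direct event delete call."""
    event.log.append("delete")


def submit_and_wait(log):
    # Blocking path: the event never escapes, so delete it directly.
    event = FakeEvent(log)
    event.wait()
    delete_event(event)


class OwnedEvent:
    """Owner wrapper for the async path; its allocation is the extra cost."""

    def __init__(self, event):
        event.log.append("alloc-owner")
        self._event = event
        self._released = False

    def wait(self):
        self._event.wait()

    def release(self):
        # Guard against double release so the event is deleted only once.
        if not self._released:
            self._released = True
            delete_event(self._event)


def submit_async(log):
    # Async path: the event is returned to the caller, so it needs an owner.
    return OwnedEvent(FakeEvent(log))


sync_log = []
submit_and_wait(sync_log)

async_log = []
handle = submit_async(async_log)
handle.wait()
handle.release()
```

In this model the blocking path records only a wait and a delete, while the async path records an additional owner allocation, which is the overhead the comment suggests avoiding when the event is not returned.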

@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch 3 times, most recently from 31b1122 to 425dabb Compare November 29, 2023 19:41
@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch from 425dabb to cb53e93 Compare November 29, 2023 20:29
@ZzEeKkAa ZzEeKkAa marked this pull request as ready for review November 29, 2023 20:29
@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch 2 times, most recently from 0ca5372 to d704dd1 Compare November 29, 2023 20:35
@ZzEeKkAa ZzEeKkAa requested a review from diptorupd November 29, 2023 20:35
Contributor

@diptorupd diptorupd left a comment

Please merge after adding the last two docstrings.

@ZzEeKkAa ZzEeKkAa force-pushed the feature/async_kernel_submition branch from d704dd1 to 707412b Compare November 30, 2023 14:00
@ZzEeKkAa ZzEeKkAa enabled auto-merge November 30, 2023 14:03
@ZzEeKkAa ZzEeKkAa merged commit 04c18bf into main Nov 30, 2023
34 of 42 checks passed
@ZzEeKkAa ZzEeKkAa deleted the feature/async_kernel_submition branch November 30, 2023 14:29