uring enablement #8

ooststep · 2024-08-09T19:10:29Z

No description provided.

Signed-off-by: Thomas Huber <thomas.huber@cornelisnetworks.com>

Replace running of on-merge workflow with a nightly workflow instead. Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

Do not use PR closed events as workflow triggers. Allow triggering PR events when targeting any branch, not just main. Change cron schedule to account for UTC. Improve conditional execution of reusable workflows. Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

This commit adds support for link up/down events in OPX for WFR platforms. Signed-off-by: Archana Venkatesha <archana.venkatesha@cornelisnetworks.com>

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com>

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

Fix 16B PBC/payload lengths Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com>

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com> Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

Signed-off-by: Lindsay Reiser <lindsay.reiser@cornelisnetworks.com>

… rendezvous Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

Add ability to independently tune the minimum threshold to use expected receive (TID) when sending. Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

Shorten the name field of opx-ci. Remove schedule-triggered Nightly job. Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

Also store pad value not buffer data Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

Signed-off-by: Thomas Huber <thomas.huber@cornelisnetworks.com>

Signed-off-by: Elias Kozah <elias.elkozah@cornelisnetworks.com>

…resulting from ignored context creation error Signed-off-by: Elias Kozah <Elias.Elkozah@cornelisnetworks.com>

…vous performance Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

The OPX provider was explicitly setting FI_REMOTE_CQ_DATA on all receive operations; however, it should only set the flag to indicate that the data field contains the completion data provided by the peer as part of their transmit request. Signed-off-by: Lindsay Reiser <lindsay.reiser@cornelisnetworks.com>

Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

Currently, efa_base_ep's default rnr_retry is 3, which performs only a few firmware-level RNR retries. This is because efa_rdm_ep supports libfabric-level RNR retry. However, the efa-direct endpoint does not support libfabric-level RNR retry, so it should retry indefinitely (rnr_retry = 7), which is also the default behavior of an SRD QP. Signed-off-by: Shi Jin <sjina@amazon.com>

This commit removes the x86-64 architecture check from the static_assert conditional compilation directive. The static_assert feature is not architecture-dependent and should be checked on all platforms that support it. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

Store the completion flags and peer address in FI_CONTEXT2 and retrieve later when writing cq. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

Signed-off-by: Sai Sunku <sunkusa@amazon.com>

Other memory monitors, such as CUDA, ROCR, and ZE, have a .c file for the implementation. This change cleans up the util_mem_monitor.c code by defining a uffd and import .c file, thus aligning to other memory monitor implementations. Signed-off-by: Mike Uttormark <mike.uttormark@hpe.com> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Some memory monitors, such as kdreg2, have a subscription context per MR cache entry. These memory monitors require unsubscribe to be called for each freed MR cache entry. To support this, call unsubscribe when an entry is removed from the MR cache RB tree. If a memory monitor does not support a subscription context per MR, unsubscribe must be implemented as a no-op. Update uffd and rocr memory monitors accordingly. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

ROCR deallocation CB will call rocr_unsubscribe with mm_lock held. If memhooks is used, since rocr_unsubscribe may call free, this can result in memhooks intercepting the free and leading to deadlock. To avoid this, freeing is deferred until locks are released. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
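The deferral described above can be sketched roughly as follows. This is a minimal illustration, not the provider's code: `demo_entry`, `demo_lock`, `demo_subscribe()`, and `demo_unsubscribe_all()` are hypothetical names, and a plain pthread mutex stands in for mm_lock. Entries are only unlinked while the lock is held, and `free()` runs after the lock is dropped, so a hook that intercepts `free()` never runs under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical subscription record; not a libfabric type. */
struct demo_entry {
	struct demo_entry *next;
	void *ctx;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_entry *demo_active; /* protected by demo_lock */

/* Allocate outside the lock, then link the entry in under the lock. */
static int demo_subscribe(void *ctx)
{
	struct demo_entry *e;

	assert(ctx != NULL);
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->ctx = ctx;

	pthread_mutex_lock(&demo_lock);
	e->next = demo_active;
	demo_active = e;
	pthread_mutex_unlock(&demo_lock);
	return 0;
}

/* Detach every entry while holding the lock, but free them only after
 * the lock has been released. Returns the number of entries freed. */
static int demo_unsubscribe_all(void)
{
	struct demo_entry *deferred, *e;
	int freed = 0;

	pthread_mutex_lock(&demo_lock);
	deferred = demo_active; /* unlink only; no free() under the lock */
	demo_active = NULL;
	pthread_mutex_unlock(&demo_lock);

	while ((e = deferred) != NULL) {
		deferred = e->next;
		free(e);
		freed++;
	}
	return freed;
}
```

The same shape applies when only one entry is removed: unlink it under the lock, remember the pointer, and free it once the lock is released.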

Subscribe, unsubscribe, and valid are callbacks which are dynamically setup. Change this to be statically set. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

An MR cache utilizing kdreg2 will have incorrect MR cache count stats if unsubscribe is not called. Signed-off-by: Mike Uttormark <mike.uttormark@hpe.com> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Prior to this patch, when efa_hmem_info_check_p2p_support_cuda elected to attempt dmabuf for p2p, the file descriptor returned by cuMemGetHandleForAddressRange was leaked in all cases. As a result, the dmabuf stayed alive for the lifetime of the process, even after dereg and after the memory was released back to the device mempool. All calls to cuda_get_dmabuf_fd need a corresponding close call. Signed-off-by: Nicholas Sielicki <nslick@amazon.com>

For some HMEM ifaces, ofi_hmem_get_dmabuf_fd() may result in a new FD being allocated. Define ofi_hmem_put_dmabuf_fd() to close FD. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

With ROCR, callers of ofi_hmem_get_dmabuf_fd() should call ofi_hmem_put_dmabuf_fd() once the DMA buf region is no longer used. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

With CUDA, callers of ofi_hmem_get_dmabuf_fd() should call ofi_hmem_put_dmabuf_fd() once the DMA buf region is no longer used. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Performing multiple HSA allocations appears to result in a DMA buf offset. Verify that the CXI provider can register a DMA buf offset memory region. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

When a MR is freed, the CXI provider should free the DMA buf FD used for the ROCR region. Failing to do this will result in FDs being exhausted. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

When a MR is freed, the CXI provider should free the DMA buf FD used for the CUDA region. Failing to do this will result in FDs being exhausted. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

This allows testing FI_CONTEXT2 in providers that require this mode bit. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

Use cuda_put_dmabuf_fd to close fd Signed-off-by: Shi Jin <sjina@amazon.com>

Signed-off-by: Zach Dworkin <zachary.dworkin@intel.com>

Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

When multiple multi-recv buffers are posted, FI_MULTI_RECV would only be set on error if an mrecv entry was already created, meaning the buffer would have already been in-use. If the buffer has not been used yet and a cancelation for this buffer has been processed, correctly set FI_MULTI_RECV when reporting the error, indicating that the buffer is no longer in use. Signed-off-by: Jerome Soumagne <jerome.soumagne@hpe.com>

This ensures that the libcurl dlopen path is correct. If the user passes '--with-curl=<path>' to configure, then the dlopen of libcurl should honor that selection and use the file path passed in. Signed-off-by: John Biddiscombe <biddisco@cscs.ch>

Signed-off-by: John Biddiscombe <biddisco@cscs.ch>

Bumps [actions/stale](https://github.com/actions/stale) from 9.0.0 to 9.1.0. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@28ca103...5bef64f) --- updated-dependencies: - dependency-name: actions/stale dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.1 to 3.28.5. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@b6a472f...f6091c0) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>

We may receive uring events before we're fully connected, so don't try to progress rx until the connection is established. Signed-off-by: Stephen Oost <stephen.oost@intel.com>

The previously used io_uring_prep_readv function does not support flags; the flags were instead being passed as an offset, triggering an illegal seek error. Signed-off-by: Stephen Oost <stephen.oost@intel.com>

ooststep force-pushed the uring branch 3 times, most recently from 9c8b7eb to 9ebf395 Compare August 14, 2024 18:34

ooststep force-pushed the uring branch from 9ebf395 to 2a04e17 Compare August 20, 2024 19:38

tmh97 and others added 26 commits October 18, 2024 10:53

prov/opx: Updated configure.m4 for ROCR

f124b4e

Signed-off-by: Thomas Huber <thomas.huber@cornelisnetworks.com>

github/actions: Adjust Cornelis Networks internal workflows

2335628

Replace running of on-merge workflow with a nightly workflow instead. Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

prov/opx: scb/hdr changes

a57e788

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Link bounce support for OPX WFR

0c1002e

This commit adds support for link up/down events in OPX for WFR platforms. Signed-off-by: Archana Venkatesha <archana.venkatesha@cornelisnetworks.com>

man: Document OPX max ping envvars

4b78fc2

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com>

prov/opx: 16B SDMA header support

ed72c6e

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Fix uepkt 16B headers

ced60a0

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Support 16B SDMA CTS work

09b7e35

Fix 16B PBC/payload lengths Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Remove polling call from internal rma write

20dd5af

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com>

prov/opx: Fix credit return

c7d0fa8

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: added OPX Tracer points to RMA code paths

fd7bac4

Signed-off-by: Mike Wilkins <michael.wilkins@cornelisnetworks.com> Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

prov/opx: Simplify fi_opx_check_rma() function.

49918a6

Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

prov/opx: Initialize nic info in fi_info

6233026

Signed-off-by: Lindsay Reiser <lindsay.reiser@cornelisnetworks.com>

prov/opx: Fix incorrect calculation of immediate block offset in send…

bd45608

… rendezvous Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

prov/opx: Add FI_OPX_TID_MIN_PAYLOAD_BYTES param

b9cd49f

Add ability to independently tune the minimum threshold to use expected receive (TID) when sending. Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>
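As a rough illustration of how such a tunable can gate the TID path, the sketch below reads the variable straight from the environment. It rests on several assumptions: the provider itself may read the parameter through libfabric's parameter interface rather than `getenv()`, the default value `DEMO_TID_MIN_DEFAULT` is invented for the example, the comparison is assumed to be inclusive, and both `demo_` helpers are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical default for this sketch; not the provider's actual value. */
#define DEMO_TID_MIN_DEFAULT 4096L

/* Parse FI_OPX_TID_MIN_PAYLOAD_BYTES, falling back to the default when
 * the variable is unset, empty, negative, or not a plain decimal number. */
static long demo_tid_min_payload_bytes(void)
{
	const char *s = getenv("FI_OPX_TID_MIN_PAYLOAD_BYTES");
	char *end;
	long v;

	if (!s || !*s)
		return DEMO_TID_MIN_DEFAULT;
	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || *end != '\0' || v < 0)
		return DEMO_TID_MIN_DEFAULT;
	return v;
}

/* Assumed policy: use expected receive (TID) only when the payload is at
 * least as large as the configured minimum. */
static int demo_use_tid(long payload_bytes)
{
	assert(payload_bytes >= 0);
	return payload_bytes >= demo_tid_min_payload_bytes();
}
```

With `FI_OPX_TID_MIN_PAYLOAD_BYTES=8192` exported, this sketch would select TID for payloads of 8192 bytes and above.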

github/actions: Modify Cornelis Networks internal workflows

bf312ce

Shorten the name field of opx-ci. Remove schedule-triggered Nightly job. Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

prov/opx: Fix payload copy

99f450e

Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Fix eager and mp eager

85e00a6

Also store pad value not buffer data Signed-off-by: Bob Cernohous <bob.cernohous@cornelisnetworks.com>

prov/opx: Fix last_bytes field for replay over sdma

4eebbb3

Signed-off-by: Thomas Huber <thomas.huber@cornelisnetworks.com>

prov/opx: fi_info -e fix for FI_OPX_UUID env var

f27d721

Signed-off-by: Elias Kozah <elias.elkozah@cornelisnetworks.com>

prov/opx: Investigate and address indeterminate behavior or segfault …

b75a0be

…resulting from ignored context creation error Signed-off-by: Elias Kozah <Elias.Elkozah@cornelisnetworks.com>

prov/opx: Include less immediate data in RTS packet to improve rendez…

890c201

…vous performance Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

prov/opx: Add debug check for zero-byte length data packets

b88aa97

Signed-off-by: Ben Lynam <Ben.Lynam@cornelisnetworks.com>

github/actions: Remove unused Cornelis Networks formatting workflow

55a9daf

Signed-off-by: Jack Morrison <jack.morrison@cornelisnetworks.com>

shijin-aws and others added 26 commits January 21, 2025 21:19

prov/efa: Implement FI_CONTEXT2 in EFA Direct

3d04127

Store the completion flags and peer address in FI_CONTEXT2 and retrieve later when writing cq. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

contrib/aws: Reduce nccl test iteration count

d1fd795

Signed-off-by: Sai Sunku <sunkusa@amazon.com>

prov/util: Statically set uffd callbacks

0eedfbb

Subscribe, unsubscribe, and valid are callbacks which are dynamically setup. Change this to be statically set. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

prov/cxi: Test monitor unsubscribe

2715617

An MR cache utilizing kdreg2 will have incorrect MR cache count stats if unsubscribe is not called. Signed-off-by: Mike Uttormark <mike.uttormark@hpe.com> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

src/hmem: Define ofi_hmem_put_dmabuf_fd

8378124

For some HMEM ifaces, ofi_hmem_get_dmabuf_fd() may result in a new FD being allocated. Define ofi_hmem_put_dmabuf_fd() to close FD. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
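The get/put pairing can be pictured with a small stand-alone sketch. The helpers `demo_get_fd()` and `demo_put_fd()` are hypothetical stand-ins that open and close `/dev/null`; they are not the ofi_hmem functions and do not share their signatures. The sketch shows that when every get is matched by a put, the same descriptor number is recycled, and that without the put the descriptor numbers keep growing until the process runs out.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define DEMO_MAX_ROUNDS 64

/* Hypothetical stand-in for an export call that allocates a new FD. */
static int demo_get_fd(int *fd)
{
	*fd = open("/dev/null", O_RDONLY);
	return *fd < 0 ? -1 : 0;
}

/* Hypothetical stand-in for the matching release call. */
static int demo_put_fd(int fd)
{
	return close(fd);
}

/* Perform `rounds` get operations. When `paired` is nonzero, each get is
 * followed by a put. Returns how far the FD numbers grew between the
 * first and last get (0 means the FD was recycled), or -1 on failure. */
static int demo_fd_growth(int rounds, int paired)
{
	int leaked[DEMO_MAX_ROUNDS];
	int nleaked = 0, first = -1, last = -1, fd, i;

	assert(rounds > 0 && rounds <= DEMO_MAX_ROUNDS);
	for (i = 0; i < rounds; i++) {
		if (demo_get_fd(&fd) != 0)
			break;
		if (first < 0)
			first = fd;
		last = fd;
		if (paired)
			demo_put_fd(fd);
		else
			leaked[nleaked++] = fd;
	}
	/* Release the descriptors that were deliberately left open. */
	while (nleaked > 0)
		demo_put_fd(leaked[--nleaked]);

	return i == rounds ? last - first : -1;
}
```

In a single-threaded process, `demo_fd_growth(16, 1)` reports no growth because `open()` returns the lowest unused descriptor, while `demo_fd_growth(16, 0)` reports growth of at least 15.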

hmem/rocr: Support ofi_hmem_put_dmabuf_fd()

fc611e5

With ROCR, callers of ofi_hmem_get_dmabuf_fd() should call ofi_hmem_put_dmabuf_fd() once the DMA buf region is no longer used. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

hmem/cuda: Support ofi_hmem_put_dmabuf_fd()

5357cee

With CUDA, callers of ofi_hmem_get_dmabuf_fd() should call ofi_hmem_put_dmabuf_fd() once the DMA buf region is no longer used. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

prov/cxi: Integrate with ofi_hmem_put_dmabuf_fd

b1d3bb4

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

prov/cxi: Test ROCR with DMA buf offset

587e37a

Performing multiple HSA allocations appears to result in a DMA buf offset. Verify that the CXI provider can register a DMA buf offset memory region. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

prov/cxi: Test ROCR with DMA buf FD recycling

4431fe5

When a MR is freed, the CXI provider should free the DMA buf FD used for the ROCR region. Failing to do this will result in FDs being exhausted. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

prov/cxi: Test CUDA with DMA buf FD recycling

ba880cc

When a MR is freed, the CXI provider should free the DMA buf FD used for the CUDA region. Failing to do this will result in FDs being exhausted. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

fabtests: Add support for FI_CONTEXT2

3f4571d

This allows testing FI_CONTEXT2 in providers that require this mode bit. Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

prov/efa: Use cuda_put_dmabuf_fd

427ab3f

Use cuda_put_dmabuf_fd to close fd Signed-off-by: Shi Jin <sjina@amazon.com>

contrib/intel/jenkins: Add --send-mail for new ci summary

2e8ef33

Signed-off-by: Zach Dworkin <zachary.dworkin@intel.com>

fabtests/efa: add rdma check for unsolicited write recv

b280fc0

Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>

prov/cxi: Make string setup of FI_CXI_CURL_LIB_PATH safe

0e6f63f

Signed-off-by: John Biddiscombe <biddisco@cscs.ch>

ooststep force-pushed the uring branch from 9905731 to f256c16 Compare January 27, 2025 22:06

ooststep added 2 commits January 27, 2025 14:07

prov/tcp: only progress rx when connected

f88f4b6

We may receive uring events before we're fully connected, so don't try to progress rx until the connection is established. Signed-off-by: Stephen Oost <stephen.oost@intel.com>

prov/tcp: use readv2 when passing flags to io uring

11e570a

The previously used io_uring_prep_readv function does not support flags; the flags were instead being passed as an offset, triggering an illegal seek error. Signed-off-by: Stephen Oost <stephen.oost@intel.com>
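The failure mode can be reproduced without liburing by using the analogous libc calls, which is what the sketch below does. It is an illustration under assumptions, not the patch itself: `preadv()` and `preadv2()` stand in for io_uring_prep_readv and io_uring_prep_readv2, a UNIX socket pair stands in for the TCP connection, and the helper names are hypothetical. A flag value placed in the offset slot asks for a positioned read, which a socket cannot do; the variant with a separate flags parameter takes offset -1 ("current position") and succeeds.

```c
#define _GNU_SOURCE /* for preadv2(); must precede every #include */
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux UAPI headers */
#endif

/* Create a connected socket pair with 4 bytes queued for sv[0]. */
static int make_readable_socket(int sv[2])
{
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return -1;
	if (write(sv[1], "ping", 4) != 4) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	return 0;
}

/* Bug pattern: the call has no flags parameter, so the flag value is
 * interpreted as a file offset. Returns the resulting errno, 0 if the
 * read unexpectedly succeeded, or -1 if setup failed. */
static int read_with_flags_as_offset(void)
{
	int sv[2], err = 0;
	char buf[8];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

	if (make_readable_socket(sv) != 0)
		return -1;
	if (preadv(sv[0], &iov, 1, (off_t) RWF_NOWAIT) < 0)
		err = errno;
	close(sv[0]);
	close(sv[1]);
	return err;
}

/* Fixed pattern: offset -1 means "use the current position" and the
 * flags travel in their own argument (0 here to keep the sketch
 * portable). Returns the number of bytes read, or -1 on failure. */
static ssize_t read_with_separate_flags(void)
{
	int sv[2];
	char buf[8];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	ssize_t n;

	if (make_readable_socket(sv) != 0)
		return -1;
	n = preadv2(sv[0], &iov, 1, -1, 0);
	assert(n <= (ssize_t) sizeof(buf));
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

On Linux, the first helper is expected to report `ESPIPE`, which `strerror()` renders as "Illegal seek", matching the error named in the commit message.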

ooststep force-pushed the uring branch from f256c16 to 11e570a Compare January 27, 2025 22:08
