Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prov/efa: Introduce efa_base_ep, change directory structure, clean up efa_ep and fix a bug in debug path #8406

Merged
merged 4 commits into from
Jan 13, 2023

Conversation

sunkuamzn
Copy link
Contributor

@sunkuamzn sunkuamzn commented Jan 10, 2023

This PR is for several changes

Introduce struct efa_base_ep

This struct contains members common to both DGRAM and RDM endpoints and will be the first member in both endpoints. The RDM endpoint is still defined in rxr.h. The DGRAM endpoint is now defined in a new file called efa_dgram.h

Change directory structure

The intended directory structure is that files related to DGRAM provider only should go in prov/efa/src/dgram. Files related to RDM provider only should go in prov/efa/src/rdm. Only files shared by both providers should remain in prov/efa/src.

This PR doesn't fully accomplish this change.

  • I need to go over the files in prov/efa/src and separate out the code that is only used RDM/DGRAM and move that into their respective folders.
  • rxr_ep still has rdm_ep as a member. So we still need to import dgram/efa_dgram.h in a few places in RDM code. The rdm_ep member and those imports will be removed in future commits.

Remove efa_ep->hmem_p2p_opt and efa_ep->util_ep_initialized

hmem_p2p_opt is only used in RDM code and util_ep_initialized doesn't seem to be necessary

And a fix for a bug introduced in #8391

@a-szegel
Copy link
Contributor

a-szegel commented Jan 10, 2023

Can we change the name rxr to be rdm? RXR seems legacy now. This can happen in a future patch.

@sunkuamzn
Copy link
Contributor Author

Can we change the name rxr to be rdm? RXR seems legacy now. This can happen in a future patch.

Yes, we will do that in future commits. Don't want to put everything in one commit or PR.

@sunkuamzn
Copy link
Contributor Author

bot:aws:retest

@sunkuamzn
Copy link
Contributor Author

sunkuamzn commented Jan 11, 2023

Windows build must be failing because I didn't update libfabric.vcxproj. Will fix in next push.

Will also rebase onto main in next push.

prov/efa/src/efa_base_ep.c should not be in this PR. Will remove in next push.

All fixed

@sunkuamzn sunkuamzn force-pushed the ep-squash-base-ep branch 2 times, most recently from 4d90823 to bbb5ac1 Compare January 13, 2023 01:38
Copy link
Contributor

@wzamazon wzamazon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM.

I think it would be better to add more information in the commit message of the commit that introduced efa_base_ep. In the commit, "rxr_ep" has base_ep in it, but rxr_ep did not actually use the qp inside base_ep. It is still using qp inside rdm_ep.

Next PR will get rid of rdm_ep inside rxr_ep and make it use base_ep.qp

@wzamazon
Copy link
Contributor

Looks like there are still conflict with rxr_cq.c

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
This commit fixes a bug that was introduced in
ofiwg#8391

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
This struct contains members that are common to both efa_ep and rxr_ep

In this commit, rxr_ep has a member efa_base_ep but it does not use the
members of efa_base_ep (CQ, AV, QP, etc) for sending and receiving
messages. rxr_ep still uses rxr_ep->rdm_ep->base_ep

In future commits, rxr_ep->rdm_ep will be removed and rxr_ep will use
the members of efa_base_ep

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
The DGRAM endpoint doesn't support HMEM, so efa_ep doesn't need the
member

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
@sunkuamzn
Copy link
Contributor Author

sunkuamzn commented Jan 13, 2023

I think it would be better to add more information in the commit message of the commit that introduced efa_base_ep.

Added

Looks like there are still conflict with rxr_cq.c

Conflict was because of Darryl's commit which was merged this morning. But the rebase was clean, so it should be fine.

@wzamazon
Copy link
Contributor

Thanks! waiting for CI to finish

@wzamazon wzamazon merged commit 85d930e into ofiwg:main Jan 13, 2023
@sunkuamzn sunkuamzn deleted the ep-squash-base-ep branch September 19, 2023 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants