-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tcp mt #7
Conversation
6499fa6
to
17fd7cc
Compare
1146426
to
838a840
Compare
68e1651
to
bdd6d5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it took so long. Wanted to be thorough and also kept getting distracted... I think it looks good! Can't see any glaring issues
prov/util/src/util_av.c
Outdated
@@ -313,6 +341,7 @@ int ofi_av_insert_addr(struct util_av *av, const void *addr, fi_addr_t *fi_addr) | |||
FI_INFO(av->prov, FI_LOG_AV, "fi_addr: %" PRIu64 "\n", | |||
ofi_buf_index(entry)); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop line
prov/tcp/src/xnet.h
Outdated
int xnet_multiplex_av_open(struct fid_domain *domain_fid, struct fi_av_attr *attr, | ||
struct fid_av **fid_av, void *context); | ||
|
||
int xnet_domain_open(struct fid_fabric *fabric, struct fi_info *info, | ||
struct fid_domain **domain, void *context); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate definitions of these two calls. I also think the only reason xnet_domain_open was moved up here was to call in create_subdomain which I think can be removed so likely need to undo this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xnet_domain_open is the original default, we add the check to see if the domain is multiplexed there. it already existed, just moved a few lines (see line 547 on the old version of this file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I'm talking about lines 499-503 in the new file - it looks like you added it twice up here. But I don't think they need to move because I think they had to move to call in the new static inline for create_subdomains which is never called. So think this whole block can be dropped except the xnet_domain_multiplexed call. At least that's how I'm reading it
prov/tcp/src/xnet.h
Outdated
struct fid_domain **domain, void *context); | ||
|
||
static inline | ||
int xnet_create_subdomain(struct xnet_domain *domain, struct xnet_domain **subdomain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we ever call create_subdomain anymore - I think we change it to just call fi_domain directly
prov/tcp/src/xnet_av.c
Outdated
@@ -0,0 +1,234 @@ | |||
/* | |||
* Copyright (c) 2017-2022 Intel Corporation. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop years
prov/tcp/src/xnet_av.c
Outdated
*/ | ||
|
||
#include <stdlib.h> | ||
#include <string.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be wrong but double check that this header is needed. I don't see an obvious reason for it
prov/tcp/src/xnet_domain.c
Outdated
struct fid_list_entry *item; | ||
|
||
domain = container_of(fid, struct xnet_domain, util_domain.domain_fid.fid); | ||
while(!dlist_empty(&domain->subdomain_list)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after while
prov/tcp/src/xnet_domain.c
Outdated
domain = calloc(1, sizeof(*domain)); | ||
if (!domain) | ||
return -FI_ENOMEM; | ||
|
||
ret = ofi_domain_init(fabric_fid, info, &domain->util_domain, context, | ||
OFI_LOCK_NONE); | ||
info->domain_attr->threading == FI_THREAD_SAFE ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember why we added this - was it a separate optimization or was it needed for these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should actually be removed. think it's left over from some earlier changes and likely has no real affect (since our new changes only apply to FI_THREAD_COMPLETION and this code is in the non-FI_THREAD_COMPLETION path, it should remain unchanged)
prov/tcp/src/xnet_progress.c
Outdated
for (i = 0; i < resp->hdr.base_hdr.rma_iov_cnt; i++) { | ||
ret = ofi_mr_verify(&ep->util_ep.domain->mr_map, rma_iov[i].len, | ||
(uintptr_t *) &rma_iov[i].addr, | ||
ret = ofi_mr_verify(&domain->util_domain.mr_map, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change was added before we moved everything to use the xnet_domain so I think we can take this out
prov/tcp/src/xnet_rdm.c
Outdated
.flags = 0, | ||
}; | ||
size_t addr_size; | ||
char addr[sizeof(struct sockaddr_in6)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a line after this
prov/tcp/src/xnet_rdm.c
Outdated
return NULL; | ||
} | ||
|
||
void set_subdomain(struct xnet_rdm *rdm, struct xnet_domain *domain, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static
b3acd52
to
225862c
Compare
prov/tcp/src/xnet.h
Outdated
int xnet_domain_open(struct fid_fabric *fabric, struct fi_info *info, | ||
struct fid_domain **domain, void *context); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this back down since it's not needed anymore (unless you want to have it up here for organization purposes 🤷♀️ )
eb38aac
to
1d50324
Compare
prov/util/src/util_mr_map.c
Outdated
@@ -52,7 +52,6 @@ dup_mr_attr(const struct fi_mr_attr *attr, uint64_t flags) | |||
dup_attr->mr_iov = (struct iovec *) (dup_attr + 1); | |||
|
|||
/* | |||
* dup_mr_attr is only used insided ofi_mr_map_insert. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry one last comment - this was changed from static and declared in the header but I don't see it called in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably leftover crud from a prior attempt. will revert
0587a3f
to
61a00b5
Compare
replace the debug-only allocated check with a check usable in release builds by checking presence in the free list Co-authored-by: Alexia Ingerson <alexia.ingerson@intel.com> Signed-off-by: Stephen Oost <stephen.oost@intel.com>
there situations where being able to insert addresses into specific indexes of the av would be convienient example: enable the tcp provider to duplicate established avs into other domains Co-authored-by: Alexia Ingerson <alexia.ingerson@intel.com> Signed-off-by: Stephen Oost <stephen.oost@intel.com>
Co-authored-by: Alexia Ingerson <alexia.ingerson@intel.com> Signed-off-by: Stephen Oost <stephen.oost@intel.com>
By default, the tcp provider is optimized for single threaded applications that make use of FI_THREAD_DOMAIN but is inefficient for multithreaded applications that intend to follow FI_THREAD_COMPLETION semantics. The primary limiting factoe is that the progress engine is tied to the domain, and all endpoint under a single domain are synchronized by this lone progress engine In an effort to increase efficiency and performance of these multithreaded applications, multiplex the application's domain reference into a subdomain per ep, each having their own progress engine Co-authored-by: Alexia Ingerson <alexia.ingerson@intel.com> Signed-off-by: Stephen Oost <stephen.oost@intel.com>
No description provided.