prov/efa: Enable medium message protocol for neuron
Benchmarking has shown that RDMA read latency is high for Neuron memory.
Even with the runting read protocol, bandwidth drops for large window
sizes. The medium protocol provides the best performance for
intermediate-sized messages.

The threshold for switching between the medium and long read protocols
was determined by benchmarking with fi_rdm_tagged_bw.

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
sunkuamzn committed Feb 2, 2024
1 parent d473fdd commit 0b75d02
Showing 2 changed files with 6 additions and 3 deletions.
5 changes: 4 additions & 1 deletion prov/efa/src/efa.h
@@ -102,12 +102,15 @@


 #define EFA_DEFAULT_RUNT_SIZE (307200)
-#define EFA_NEURON_RUNT_SIZE (131072)
 #define EFA_DEFAULT_INTER_MAX_MEDIUM_MESSAGE_SIZE (65536)
 #define EFA_DEFAULT_INTER_MIN_READ_MESSAGE_SIZE (1048576)
 #define EFA_DEFAULT_INTER_MIN_READ_WRITE_SIZE (65536)
 #define EFA_DEFAULT_INTRA_MAX_GDRCOPY_FROM_DEV_SIZE (3072)

+#define EFA_NEURON_RUNT_SIZE (131072)
+#define EFA_NEURON_INTER_MAX_MEDIUM_MESSAGE_SIZE (49152)
+#define EFA_NEURON_INTER_MIN_READ_MESSAGE_SIZE (49152)

 /*
  * The default memory alignment
  */
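To make the effect of the new Neuron thresholds concrete, here is a minimal sketch of size-based protocol selection. It is not the provider's actual selection code: the `proto` enum, the `thresholds` struct, `select_proto`, and the order of the checks are all hypothetical, and the runting read variant is folded into the long read case. The maximum eager size is device dependent, so any value passed for it is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the EFA provider's own types. */
enum proto { PROTO_EAGER, PROTO_MEDIUM, PROTO_LONG_READ, PROTO_LONG_CTS };

struct thresholds {
	size_t max_eager_msg_size;  /* assumed; really depends on MTU and headers */
	size_t max_medium_msg_size; /* largest message sent with medium protocol */
	size_t min_read_msg_size;   /* smallest message sent with a read protocol */
};

/* Pick a protocol for a message of `len` bytes (simplified sketch). */
static enum proto select_proto(const struct thresholds *t, size_t len)
{
	assert(t != NULL);

	if (len <= t->max_eager_msg_size)
		return PROTO_EAGER;
	if (len <= t->max_medium_msg_size)
		return PROTO_MEDIUM;
	if (len >= t->min_read_msg_size)
		return PROTO_LONG_READ;
	/* Gap between the medium and read thresholds. */
	return PROTO_LONG_CTS;
}
```

Under this sketch, with an assumed eager limit of 8192 bytes, the old Neuron settings (`max_medium_msg_size = 0`, `min_read_msg_size = 8193`) send a 16384-byte message through the read path, while the new settings (both 49152) send it through the medium protocol and switch to read only above 49152 bytes.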
4 changes: 2 additions & 2 deletions prov/efa/src/efa_hmem.c
@@ -91,8 +91,8 @@ static int efa_domain_hmem_info_init_protocol_thresholds(struct efa_domain *efa_
 	case FI_HMEM_NEURON:
 		info->runt_size = EFA_NEURON_RUNT_SIZE;
 		info->max_intra_eager_size = 0;
-		info->max_medium_msg_size = 0;
-		info->min_read_msg_size = efa_max_eager_msg_size_with_largest_header(efa_domain) + 1;
+		info->max_medium_msg_size = EFA_NEURON_INTER_MAX_MEDIUM_MESSAGE_SIZE;
+		info->min_read_msg_size = EFA_NEURON_INTER_MIN_READ_MESSAGE_SIZE;
 		info->min_read_write_size = efa_max_eager_msg_size_with_largest_header(efa_domain) + 1;
 		fi_param_get_size_t(&efa_prov, "runt_size", &info->runt_size);
 		fi_param_get_size_t(&efa_prov, "inter_min_read_message_size", &info->min_read_msg_size);
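The `fi_param_get_size_t` calls above mean the compiled-in Neuron defaults are only starting values: libfabric provider parameters are exposed as `FI_<PROVIDER>_<NAME>` environment variables, so `inter_min_read_message_size` corresponds to `FI_EFA_INTER_MIN_READ_MESSAGE_SIZE`. The following is a rough stand-in for that "default, then optional override" pattern using plain `getenv`. The helper `override_size_from_env` is hypothetical and is not how libfabric implements parameter lookup.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: overwrite *value with the decimal number found in
 * the environment variable env_name. Returns 1 if *value was overridden,
 * 0 if the variable is unset or invalid (in which case *value is untouched).
 */
static int override_size_from_env(const char *env_name, size_t *value)
{
	const char *s = getenv(env_name);
	char *end;
	unsigned long long parsed;

	if (!s || *s == '\0' || *s == '-')
		return 0;

	errno = 0;
	parsed = strtoull(s, &end, 10);
	if (errno != 0 || *end != '\0' || parsed > SIZE_MAX)
		return 0;

	*value = (size_t)parsed;
	return 1;
}
```

With this pattern, a caller would first assign the default (for example 49152) and then call the helper, so an unset variable leaves the benchmarked threshold in place.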