Add default ECMP/LAG hash offset values for T0 and T1 #18912

Merged
merged 5 commits into from
May 21, 2024

Conversation

kperumalbfn
Contributor

@kperumalbfn kperumalbfn commented May 8, 2024

Why I did it

Add default ECMP/LAG hash offset values.

Work item tracking
  • Microsoft ADO (number only): 25873808

How I did it

Update the switch.json file in the swss docker.

How to verify it

Run the sonic-mgmt test "ipfwd/test_nhop_group.py::test_nhop_group_member_order_capability" with the nexthops updated after applying the hash_offset values.

Which release branch to backport (provide reason below if selected)

Tested branch (Please provide the tested image version)

master

Description for the changelog

To avoid ECMP polarization, hash offset values are set from orchagent.

Previously, the vendor SDK set the hash_offset value internally to the same value as hash_seed. After the SAI attributes (SAI_SWITCH_ATTR_ECMP_DEFAULT_HASH_OFFSET/SAI_SWITCH_ATTR_LAG_DEFAULT_HASH_OFFSET) were introduced, that SDK behavior was removed, which caused traffic imbalance on T1s.
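To illustrate why identical hashing on both tiers causes imbalance, here is a toy simulation. It is only a sketch under stated assumptions: the offset is modeled as selecting which bits of a 32-bit flow hash pick the group member, SHA-256 stands in for the ASIC hash, and the names `flow_hash`, `pick_member`, and `t1_distribution` are hypothetical. Real ASICs may apply the offset differently.

```python
import hashlib
from collections import Counter

def flow_hash(flow_id: int) -> int:
    """Stand-in for the ASIC packet hash: a 32-bit value derived from the flow key."""
    digest = hashlib.sha256(flow_id.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")

def pick_member(flow_id: int, offset: int, members: int) -> int:
    """Toy model (assumption): the offset selects which hash bits choose the member."""
    return (flow_hash(flow_id) >> offset) % members

def t1_distribution(t0_offset: int, t1_offset: int,
                    members: int = 4, flows: int = 4096) -> Counter:
    """Count how the flows that a T0 sent to uplink 0 spread over one T1's members."""
    arriving = [f for f in range(flows) if pick_member(f, t0_offset, members) == 0]
    return Counter(pick_member(f, t1_offset, members) for f in arriving)

same = t1_distribution(t0_offset=0, t1_offset=0)       # both tiers hash identically
different = t1_distribution(t0_offset=0, t1_offset=8)  # T1 uses other hash bits

print("same offset:", dict(same))
print("different offsets:", dict(different))
```

With the same offset on both tiers, every flow reaching the T1 already hashed to the same bucket at the T0, so the T1 sends all of them to a single member. With a different offset, the T1 spreads those flows across its members.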

This change passes the ECMP/LAG HASH_OFFSET values from orchagent to SAI and sets them in the ASIC.

  • Different ECMP/LAG hash_offset values are set for T0 and T1 in the swss switch.json (ecmp_hash_offset and lag_hash_offset), so that the two tiers do not hash identically and polarize ECMP/LAG traffic.
  • As part of switch initialization, these values are passed to the switch orchagent.
  • The switch orchagent processes these values and calls the SAI switch API to set them.
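The flow in the list above can be sketched roughly as follows. This is an illustration, not the actual swss code: the offset values are hypothetical placeholders (the PR's real values are not quoted here), `hash_offset_fields` and `apply_to_switch` are made-up names, and the `set_attr` callback merely stands in for the SAI switch API call. Only the field names `ecmp_hash_offset` and `lag_hash_offset` come from the description.

```python
import json

# Hypothetical per-role offsets, for illustration only.
ROLE_OFFSETS = {
    "t0": {"ecmp_hash_offset": 0, "lag_hash_offset": 0},
    "t1": {"ecmp_hash_offset": 8, "lag_hash_offset": 8},
}

def hash_offset_fields(role: str) -> dict:
    """Return the hash-offset fields for a device role, or {} if none are defined."""
    return dict(ROLE_OFFSETS.get(role.lower(), {}))

def apply_to_switch(fields: dict, set_attr) -> None:
    """Mimic the orchagent step: hand each configured value to a setter callback."""
    for name, value in sorted(fields.items()):
        set_attr(name, value)

# Record the calls instead of touching real hardware.
calls = []
apply_to_switch(hash_offset_fields("T1"), lambda name, value: calls.append((name, value)))
print(json.dumps(calls))
```

A role with no entry yields no fields, so nothing is passed to the setter and whatever the platform configured stays in place.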

Will be merged after swss changes: sonic-net/sonic-swss#3138

sonic-mgmt - sonic-net/sonic-mgmt#12765

@kperumalbfn kperumalbfn requested a review from lguohan as a code owner May 8, 2024 17:48
@kperumalbfn kperumalbfn changed the title Kperumal/ecmp lag hash Add default ECMP/LAG hash offset values May 8, 2024
@liushilongbuaa
Contributor

/azpw ms_conflict

@kperumalbfn
Contributor Author

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@kperumalbfn
Contributor Author

/azp run


Azure Pipelines successfully started running 1 pipeline(s).


Azure Pipelines successfully started running 1 pipeline(s).

@kperumalbfn
Contributor Author

/azp run Azure.sonic-buildimage


Azure Pipelines successfully started running 1 pipeline(s).

@kperumalbfn
Contributor Author

/azp run


Azure Pipelines successfully started running 1 pipeline(s).

@kperumalbfn
Contributor Author

/azpw ms_conflict

@yaqiangz
Contributor

/azpw ms_conflict -f

@kperumalbfn
Contributor Author

/azpw ms_conflict

@kperumalbfn
Contributor Author

/azpw ms_conflict -f

@yaqiangz
Contributor

/azpw ms_conflict -f

@liushilongbuaa
Contributor

/azpw ms_conflict -f

@kperumalbfn kperumalbfn changed the title Add default ECMP/LAG hash offset values Add default ECMP/LAG hash offset values for T0 and T1 May 21, 2024
@lguohan lguohan merged commit 518c3bc into sonic-net:master May 21, 2024
19 checks passed
@kperumalbfn kperumalbfn deleted the kperumal/ecmp_lag_hash branch May 21, 2024 19:17
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Jun 26, 2024
…net#18912)

@mssonicbld
Collaborator

Cherry-pick PR to 202311: #19397

yxieca pushed a commit that referenced this pull request Jun 26, 2024
#19397)

Co-authored-by: Kumaresh Perumal <kperumal@microsoft.com>
yxieca added a commit that referenced this pull request Jun 28, 2024
yxieca added a commit that referenced this pull request Jul 1, 2024
wangxin pushed a commit that referenced this pull request Sep 4, 2024
Why I did it
Fixes #19746

switch.json in the swss docker sets the default hash offset to 0 for chassis-packet. This was recently introduced via #18912, which was supposed to apply only to T0/T1. On chassis-packet, each ASIC needs its own unique hash offset for the load-balancing algorithm. The new change overwrites the platform setting and resets the hash offset to 0 on all ASICs, which breaks ECMP load balancing on chassis-packet.

How I did it
Remove the default hash-offset setting for chassis-packet. This setting is provided by the platform based on platform-specific rules.
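A rough sketch of the before/after behavior described here, under the assumption that an emitted default simply replaces whatever the platform configured per ASIC. The function names and the per-ASIC offsets are hypothetical and exist only for illustration. The only thing taken from the description is that chassis-packet should get no default.

```python
from typing import List, Optional

def default_hash_offset(switch_type: str, default: int = 0) -> Optional[int]:
    """Return the default to emit, or None to leave the platform's value alone."""
    if switch_type == "chassis-packet":
        return None
    return default

def effective_offsets(platform_offsets: List[int], emitted: Optional[int]) -> List[int]:
    """An emitted default replaces every ASIC's platform-provided offset."""
    if emitted is None:
        return list(platform_offsets)
    return [emitted] * len(platform_offsets)

platform = [1, 2, 3]  # hypothetical per-ASIC offsets chosen by the platform

# Before the fix: a default of 0 is emitted even on chassis-packet.
before_fix = effective_offsets(platform, 0)
# After the fix: no default is emitted for chassis-packet.
after_fix = effective_offsets(platform, default_hash_offset("chassis-packet"))

print(before_fix, after_fix)
```

Before the fix every ASIC ends up with the same offset. After it, the platform's distinct per-ASIC offsets survive.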

How to verify it
Run sonic-mgmt fib/test_fib.py

Signed-off-by: anamehra <anamehra@cisco.com>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Sep 4, 2024
mssonicbld pushed a commit that referenced this pull request Sep 5, 2024
vvolam pushed a commit to vvolam/sonic-buildimage that referenced this pull request Sep 12, 2024