Fix flakiness of qos/test_qos_dscp_mapping.py (#13496)
What is the motivation for this PR?
In some cases, after the test sends its packets (2000), the queue counter reads lower than the expected value, which causes the test to fail. Reading the counter again while in the failed state (at a breakpoint) shows the correct value.

Sample output in failure case:

 3238         "  Inner Packet DSCP Value    Expected Egress Queue    Egress Queue Count  Result                              Actual Egress Queue\n",
 3239         "-------------------------  -----------------------  --------------------  --------------------------------  ---------------------\n",
 3240         "                        0                        1                   865  FAILURE - DUT POLL FAILURE                           -1\n",
 3241         "                        1                        1                  2000  SUCCESS                                               1\n",
 3242         "                        2                        1                  1022  FAILURE - DUT POLL FAILURE                           -1\n",
 3243         "                        3                        3                  2000  SUCCESS                                               3\n",
 3244         "                        4                        4                  1396  FAILURE - DUT POLL FAILURE                           -1\n",
 3245         "                        5                        1                  2000  SUCCESS                                               1\n",
 3246         "                        6                        1                  1449  FAILURE - DUT POLL FAILURE                           -1\n",
 3247         "                        7                        1                  2000  SUCCESS                                               1\n",
 3248         "                        8                        0                  1909  FAILURE - INCORRECT PACKET COUNT                      0\n",
 3249         "                        9                        1                  2000  SUCCESS                                               1\n",
...
How did you do it?
Updated the test to wait for at least 10s (the hardware counter polling interval) before reading the queue counters.
$ counterpoll show | grep -i queue_stat
QUEUE_STAT                  default (10000)     enable
Updated the logic to re-poll the counters if the egress packet count is not as expected.
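The re-poll approach described above can be sketched as a small polling loop. This is only an illustration under stated assumptions: `poll_until` and `FakeQueueCounter` are hypothetical names invented here, not the `wait_until` helper from `tests.common.utilities`, and the stale readings (865, 1396) are borrowed from the sample output above to simulate a counter that lags behind the traffic.

```python
import time
from typing import Callable, List


def poll_until(timeout_s: float, interval_s: float, condition: Callable[[], bool]) -> bool:
    """Hypothetical helper: re-check `condition` every `interval_s` seconds
    until it returns True or `timeout_s` seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakeQueueCounter:
    """Hypothetical stand-in for a DUT queue counter that is refreshed
    only periodically, so early reads return stale values."""

    def __init__(self, readings: List[int]) -> None:
        self._readings = list(readings)

    def read(self) -> int:
        # Return successive readings, then keep repeating the last one.
        if len(self._readings) > 1:
            return self._readings.pop(0)
        return self._readings[0]


EXPECTED_PKT_COUNT = 2000  # packets sent per DSCP value in the test

# Counter that catches up after two stale reads.
lagging = FakeQueueCounter([865, 1396, 2000])
caught_up = poll_until(1.0, 0.01, lambda: lagging.read() >= EXPECTED_PKT_COUNT)

# Counter that never reaches the expected count: polling gives up at the timeout.
stuck = FakeQueueCounter([865])
never_ok = poll_until(0.05, 0.01, lambda: stuck.read() >= EXPECTED_PKT_COUNT)
```

The comparison uses `>=` rather than `==` because, as the original test notes, protocol packets can push the egress queue count above the number of packets sent.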
How did you verify/test it?
Stress-tested the test with the fix on Arista-7260CX3-D108C8. The test passes consistently with the fix.
vkjammala-arista authored Jul 15, 2024
1 parent fcb9eb3 commit beeeaa5
Showing 1 changed file with 4 additions and 11 deletions.
15 changes: 4 additions & 11 deletions tests/qos/test_qos_dscp_mapping.py
@@ -15,7 +15,7 @@
 from tests.common.helpers.ptf_tests_helper import downstream_links, upstream_links, select_random_link,\
 get_stream_ptf_ports, get_dut_pair_port_from_ptf_port, apply_dscp_cfg_setup, apply_dscp_cfg_teardown # noqa F401
 from tests.common.utilities import get_ipv4_loopback_ip, get_dscp_to_queue_value, find_egress_queue,\
-get_egress_queue_pkt_count_all_prio
+get_egress_queue_pkt_count_all_prio, wait_until
 from tests.common.helpers.assertions import pytest_assert
 from tests.common.fixtures.duthost_utils import dut_qos_maps_module # noqa F401

@@ -127,8 +127,6 @@ def send_and_verify_traffic(ptfadapter,
 logger.info("Received packet(s) on port {}".format(ptf_dst_port_ids[port_index]))
 global packet_egressed_success
 packet_egressed_success = True
-# Wait for packets to be processed by the DUT
-time.sleep(8)
 return ptf_dst_port_ids[port_index]

 except AssertionError as detail:
@@ -280,15 +278,10 @@ def _run_test(self,
 if packet_egressed_success:
 dut_egress_port = get_dut_pair_port_from_ptf_port(duthost, tbinfo, dst_ptf_port_id)
 pytest_assert(dut_egress_port, "No egress port on DUT found for ptf port {}".format(dst_ptf_port_id))
+# Wait for the queue counters to be populated.
+verification_success = wait_until(60, 2, 0, lambda: find_queue_count_and_value(duthost,
+queue_val, dut_egress_port)[0] >= DEFAULT_PKT_COUNT)
 egress_queue_count, egress_queue_val = find_queue_count_and_value(duthost, queue_val, dut_egress_port)
-# Re-poll DUT if queue value could not be accurately found
-if egress_queue_val == -1:
-time.sleep(2)
-egress_queue_count, egress_queue_val = find_queue_count_and_value(duthost, queue_val,
-dut_egress_port)
-# Due to protocol packets, egress_queue_count can be greater than expected count.
-verification_success = egress_queue_count >= DEFAULT_PKT_COUNT
-
 if verification_success:
 logger.info("SUCCESS: Received expected number of packets on queue {}".format(queue_val))
 output_table.append([rotating_dscp, queue_val, egress_queue_count, "SUCCESS", queue_val])