Updated TS and auto-TS to collect orch abrt saisdkdump files #2540

Closed
wants to merge 8 commits into from

@vivekrnv vivekrnv commented Dec 6, 2022

Signed-off-by: Vivek Reddy Karri vkarri@nvidia.com

What I did

  • Update auto-techsupport to wait for a lock file to be created; the lock file signals that the host has collected the saisdkdump before stopping syncd.
  • Enhance techsupport:
    1. Add the ability to collect the saisdkdump files (if present under /var/log/orch_abrt_sdkdump/).
    2. Update collect_mellanox to check the syncd container status and to confirm that the create_switch call succeeded (by checking whether PortInitDone is populated in APPL_DB) before invoking saisdkdump under syncd.
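
The precondition in item 2 could be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `_run` and `syncd_ready_for_sdkdump` are hypothetical, "container status" is assumed to mean "running", and the key name `PORT_TABLE:PortInitDone` is an assumption about where PortInitDone lives in APPL_DB.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stripped stdout, or "" on any failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def syncd_ready_for_sdkdump(run: Callable[[Sequence[str]], str] = _run) -> bool:
    """Return True only if syncd is running and PortInitDone is present.

    Hypothetical helper: `run` is injectable so the logic can be exercised
    without docker or sonic-db-cli being installed.
    """
    # Assumption: "container status" is interpreted as the container running.
    running = run(["docker", "inspect", "-f", "{{.State.Running}}", "syncd"])
    if running != "true":
        return False

    # Assumption: PortInitDone is stored under PORT_TABLE:PortInitDone.
    keys = run(["sonic-db-cli", "APPL_DB", "KEYS", "PORT_TABLE:PortInitDone"])
    return bool(keys)
```

Only when both checks pass would saisdkdump be invoked under syncd; otherwise the collection step would be skipped rather than run against a switch that was never created.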

How I did it

How to verify it

Simulate a SAI programming failure

root@r-panther-13:/home/admin# sonic-db-cli STATE_DB SET ORCH_ABRT_STATUS 1
root@r-panther-13:/home/admin# docker exec -it swss kill -6 47  (sends SIGABRT to orchagent, which creates a coredump and triggers auto-ts)

LOGS

Nov 17 01:34:07.837066 r-panther-13 INFO systemd[1]: Stopping syncd service...
Nov 17 01:42:49.083737 r-panther-13 NOTICE root: sai_sdk_dump_1668642162.tar.gz collected before stopping syncd

Nov 17 01:42:50.304222 r-panther-13 INFO coredump_gen_handler.py[233582]: Waited until the saisdkdump is collected, proceeding forward..
Nov 17 01:44:18.572194 r-panther-13 INFO coredump_gen_handler.py[233582]: show techsupport --silent --global-timeout 60 --since 2 days ago is successful, sonic_dump_r-panther-13_20221117_014251.tar.gz is created
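
The "Waited until the saisdkdump is collected" message reflects auto-techsupport blocking until the lock file appears. A minimal sketch of such a wait is shown here, assuming a simple polling loop with a timeout; the function name `wait_for_file`, its parameters, and the default values are hypothetical and are not taken from coredump_gen_handler.py.

```python
import os
import time


def wait_for_file(path: str, timeout_s: float = 60.0,
                  poll_interval_s: float = 0.5) -> bool:
    """Poll until `path` exists or `timeout_s` elapses.

    Returns True if the file was seen, False on timeout. Hypothetical
    helper; the real handler may use a different mechanism.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

A bounded wait matters here: if the host never creates the lock file, the caller can still proceed to `show techsupport` after the timeout instead of hanging.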

DUMP

root@r-panther-13:/home/admin/sonic_dump_r-panther-13_20221117_014251# ls -l orch_abrt_sdkdump/
total 5148
-rw-r--r-- 1 root root 1756648 Nov 17 01:44 sai_sdk_dump_1668638456.tar.gz -> ../log/sai_sdk_dump_1668638456.tar.gz
-rw-r--r-- 1 root root 1756954 Nov 17 01:44 sai_sdk_dump_1668641647.tar.gz -> ../log/sai_sdk_dump_1668641647.tar.gz
-rw-r--r-- 1 root root 1753301 Nov 17 01:44 sai_sdk_dump_1668642162.tar.gz -> ../log/sai_sdk_dump_1668642162.tar.gz
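
As a rough illustration of the collection step (techsupport itself may implement this differently), the following sketch copies any saisdkdump archives found in a source directory into a dump directory. The function name `collect_sdkdumps` is hypothetical, and the file pattern is an assumption based on the file names in the listing above.

```python
import glob
import os
import shutil
from typing import List


def collect_sdkdumps(src_dir: str, dest_dir: str,
                     pattern: str = "sai_sdk_dump_*.tar.gz") -> List[str]:
    """Copy matching archives from src_dir to dest_dir.

    Returns the sorted base names of the copied files. If src_dir does not
    exist, nothing is copied and an empty list is returned, mirroring the
    "if present" condition in the description.
    """
    if not os.path.isdir(src_dir):
        return []
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        shutil.copy2(path, dest_dir)  # copy2 also preserves timestamps
        copied.append(os.path.basename(path))
    return copied
```

For example, `collect_sdkdumps("/var/log/orch_abrt_sdkdump", "/tmp/dump/orch_abrt_sdkdump")` would return an empty list on a system where no orchagent abort has occurred (the destination path here is only an example).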

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…e times

@vivekrnv vivekrnv closed this Jan 5, 2023