[reboot] User-friendly reboot cause message for kernel panic #1486
panicked. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
@sujinmkang: We want to rephrase this message to be more user-friendly. However, we may need to either adjust the syntax here or in the
@jleveque Does this PR also want to add detailed kernel panic information, or just report the kernel crash as the last reboot cause?
Just inform that kernel panic was the last reboot cause.
@jleveque If the kernel panic happens after this line, I mean during the reboot itself, is there still a chance of missing the kernel panic? I think it would be good to include the core file directory or file name related to the kernel panic. We can add it from process-reboot-cause.
I see your point, but if a kernel panic occurs during reboot, it will be difficult (possibly impossible, e.g., if the filesystem is read-only) to leave a breadcrumb that we can use after booting back up to determine that a kernel panic occurred. Also, technically speaking, if a user issues one of the
https://github.com/Azure/sonic-buildimage/blob/28cb43cb42e3223cade2efa9a5f60542d97a89e7/src/sonic-host-services/scripts/determine-reboot-cause#L129
`read_reboot_cause_file()`. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
show/reboot_cause.py
Outdated
reboot_cause_str = "Cause: {}".format(reboot_cause["cause"])

if "user" in reboot_cause.keys() and reboot_cause["user"] != "N/A":
    reboot_cause_str += ", User: {}".format(reboot_cause["user"])

if "time" in reboot_cause.keys() and reboot_cause["time"] != "N/A":
    reboot_cause_str += ", Time: {}".format(reboot_cause["time"])
No need for the "Cause:" prefix. Also, we lose the "User issued '<command>' command" syntax here; instead we just see the command.
Suggest keeping the format the same as before.
Something like the following:
reboot_cause_dict = read_reboot_cause_file()
reboot_cause = reboot_cause_dict.get("cause", "Unknown")
reboot_user = reboot_cause_dict.get("user", "N/A")
reboot_time = reboot_cause_dict.get("time", "N/A")

# A named user means the cause is a command that the user issued.
if reboot_user != "N/A":
    reboot_cause_str = "User issued '{}' command".format(reboot_cause)
else:
    reboot_cause_str = reboot_cause

# Append the bracketed details only when at least one field is known.
if reboot_user != "N/A" or reboot_time != "N/A":
    reboot_cause_str += " ["
    if reboot_user != "N/A":
        reboot_cause_str += "User: {}".format(reboot_user)
        if reboot_time != "N/A":
            reboot_cause_str += ", "
    if reboot_time != "N/A":
        reboot_cause_str += "Time: {}".format(reboot_time)
    reboot_cause_str += "]"
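The suggestion above can also be wrapped in a function so that the formatting is testable without reading the file. This is only a sketch of that idea: the function name `format_reboot_cause` is hypothetical and is not part of the PR, and the input dictionary is assumed to have the same keys as previous-reboot-cause.json.

```python
def format_reboot_cause(reboot_cause_dict):
    """Build the one-line summary from a reboot-cause dictionary (hypothetical helper)."""
    cause = reboot_cause_dict.get("cause", "Unknown")
    user = reboot_cause_dict.get("user", "N/A")
    time = reboot_cause_dict.get("time", "N/A")

    # A named user means the cause is a command that the user issued.
    if user != "N/A":
        summary = "User issued '{}' command".format(cause)
    else:
        summary = cause

    # Collect only the fields that are known, then join them in brackets.
    details = []
    if user != "N/A":
        details.append("User: {}".format(user))
    if time != "N/A":
        details.append("Time: {}".format(time))
    if details:
        summary += " [{}]".format(", ".join(details))
    return summary


if __name__ == "__main__":
    sample = {"cause": "Kernel Panic", "user": "N/A",
              "time": "Wed 24 Mar 2021 11:22:03 PM UTC"}
    print(format_reboot_cause(sample))
```

Collecting the optional fields in a list avoids the nested checks for where the comma goes, at the cost of one extra allocation.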
format of message. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
when previous reboot file can not be read successfully. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Retest this please. |
Signed-off-by: Yong Zhao yozhao@microsoft.com
Why I did it
If a device reboot was caused by a kernel panic, we need to retrieve the key information and store it in the symbol file previous-reboot-cause.json. The CLI command show reboot-cause reads this file to get the reason for the previous reboot. This PR is related to a PR in the sonic-utilities repo: sonic-net/sonic-utilities#1486
How I did it
The string variable previous_reboot_cause is parsed to check whether it contains the keyword Kernel Panic. If it does, the keyword and the time information are stored in a dictionary.
How to verify it
I verified this change on a virtual testbed.
admin@vlab-01:/host/reboot-cause$ more previous-reboot-cause.json
{"gen_time": "2021_03_24_23_22_35", "cause": "Kernel Panic", "user": "N/A", "time": "Wed 24 Mar 2021 11:22:03 PM UTC", "comment": "N/A"}
admin@vlab-01:/host/reboot-cause$ show reboot-cause
Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]
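The parsing step described in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the code from the commit: the regular expression, the function name `parse_kernel_panic`, and the choice to return `None` for other causes are all hypothetical. The input is assumed to have the `Kernel Panic [Time: ...]` shape seen in the sample output, and the `gen_time` field is left out because the source does not say how it is generated.

```python
import re

# Hypothetical pattern: the keyword, optionally followed by a "[Time: ...]" suffix.
KERNEL_PANIC_RE = re.compile(r"Kernel Panic(?: \[Time: (?P<time>[^\]]+)\])?")


def parse_kernel_panic(previous_reboot_cause):
    """Return a reboot-cause dictionary if the string mentions a kernel panic, else None."""
    match = KERNEL_PANIC_RE.search(previous_reboot_cause)
    if match is None:
        return None
    return {
        "cause": "Kernel Panic",
        "user": "N/A",
        # Fall back to "N/A" when the message carries no timestamp.
        "time": match.group("time") or "N/A",
        "comment": "N/A",
    }


if __name__ == "__main__":
    print(parse_kernel_panic("Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]"))
```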
Signed-off-by: Yong Zhao yozhao@microsoft.com
What I did
If the reboot of a SONiC device was caused by a kernel panic, the CLI command show reboot-cause should show Kernel Panic.
How I did it
Currently, if the kernel panics, the device is rebooted and the reboot script writes a message into reboot-cause.txt. I just updated the content of this message.
How to verify it
I verified this change on the virtual switch with the following steps:
Trigger a kernel panic: echo c > /proc/sysrq-trigger
After the device has rebooted, run the CLI command show reboot-cause:
admin@vlab-01:~$ show reboot-cause
Kernel Panic [Time: Tue 09 Mar 2021 03:03:56 AM UTC]
Previous command output (if the output of a command-line utility has changed)
admin@vlab-01:~$ show reboot-cause
User issued 'kdump' command [User: kdump, Time: Mon 08 Mar 2021 01:47:43 AM UTC]
New command output (if the output of a command-line utility has changed)
admin@vlab-01:~$ show reboot-cause
Kernel Panic [Time: Tue 09 Mar 2021 03:03:56 AM UTC]
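For illustration, a message with the same layout as the new output can be composed as in the sketch below. The source does not show the reboot script itself, so this is not its implementation: the function name `kernel_panic_message` is hypothetical, and the strftime pattern is only an assumption chosen to mirror the timestamp in the sample output.

```python
from datetime import datetime, timezone


def kernel_panic_message(now=None):
    """Compose a 'Kernel Panic [Time: ...]' line (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Mirrors the timestamp layout in the sample output. Day and month
    # names follow the process locale, so a non-English LC_TIME changes them.
    timestamp = now.strftime("%a %d %b %Y %I:%M:%S %p UTC")
    return "Kernel Panic [Time: {}]".format(timestamp)


if __name__ == "__main__":
    print(kernel_panic_message())
```

Passing the time as a parameter keeps the helper deterministic when it is called with a fixed datetime.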