[fast-reboot] I2C failure after executed fast-reboot #267
Comments
Is the issue resolved?
"Usually, switch vendor will implement some control or monitor mechanism to handle their device via I2C, and it will generate I2C traffic periodically"
I do not quite understand who is using the I2C bus during the fast reboot. The statement above says "switch vendor", which is not clear to me. Who? The ASIC? The platform driver?
During the fast reboot, after we run kexec, no user-space program is running. During this period, who is using the I2C tree?
@lguohan
I don't think it would hurt to shut down all platform drivers before kexec to ensure that devices aren't left in a bad state.
Update sonic-linux-kernel submodule to updated 202012 branch. This brings in the following commits:
```
e97f9fc [202012] Add upstreamed patches which backport support for registers for CPLD PNs (sonic-net#275)
58abcdc Merge pull request sonic-net#267 from Staphylo/202012-log-buf-len
3f16f4f Merge pull request sonic-net#268 from Staphylo/202012-emmc-fixes
a120ae7 Apply kernel patches to fix emmc unreliability
5f4a3f3 Increase log_buf_len to 1M for all architecture
```
Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Description
Sometimes the I2C topology fails after executing "fast-reboot".
This is because fast-reboot does not trigger the release procedure (e.g. /etc/rc1~6.d/K*) as a normal reboot does; it boots into the new kernel directly.
Usually, a switch vendor implements some control or monitoring mechanism to handle their devices via I2C, and it generates I2C traffic periodically.
As a result, something may be using I2C during fast-reboot, and the new kernel then fails to initialize the I2C topology or finds some device in a busy state.
Therefore, I suggest modifying /usr/bin/fast-reboot to add either of the following:
(1) Trigger /etc/rc1~6.d/K* before the reboot step of fast-reboot.
OR
(2) Trigger the switch vendor's release procedure (/etc/init.d/platform-module-xxxxx stop) before the reboot step of fast-reboot.
=> Usually, platform-module-xxxxx is responsible for starting/stopping the I2C topology and the I2C-related drivers.
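Option (2) could look roughly like the sketch below. This is only an illustration, not the actual fast-reboot code: the function name `stop_platform_modules` is hypothetical, and the `platform-module*` glob is an assumption based on the script names mentioned in this issue.

```shell
#!/bin/bash
# Hypothetical helper: ask every vendor platform-module init script to stop,
# so that I2C drivers are released before the new kernel is started.
# Assumption: vendor scripts live in /etc/init.d and match "platform-module*".
stop_platform_modules() {
    local init_dir="${1:-/etc/init.d}"
    local script
    for script in "$init_dir"/platform-module*; do
        # Skip when the glob matched nothing or the file is not executable.
        [ -x "$script" ] || continue
        echo "Stopping $(basename "$script")"
        "$script" stop || echo "warning: $script stop failed" >&2
    done
}
```

In such a design, `stop_platform_modules` would be called without arguments just before the final reboot step of fast-reboot; the optional directory argument exists only to make the helper easy to try out against a scratch directory.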
Steps to reproduce the issue
To observe the pure behavior of SONiC fast-reboot,
this reproduction procedure first excludes the switch vendor's module.
Exclude the switch vendor's init script
=> Comment out the contents of the "start" part of /etc/init.d/platform-module-xxxxx, or remove the script.
[Ex]
root@SONiC-Inventec-d7054:~# cat /etc/init.d/platform-modules-d7054q28b
...
start)
echo -n "Setting up board... "
depmod -a
# /usr/local/bin/inventec_d7054_util.py -f install <<<<<<< comment out here!
echo "done."
;;
...
Reboot the system to ensure that no switch vendor module is loaded
[Ex]
root@SONiC-Inventec-d7054:# reboot
Probe I2C modules and set up the I2C topology manually.
[Ex]
root@SONiC-Inventec-d7054:# modprobe i2c-mux
root@SONiC-Inventec-d7054:# modprobe i2c-mux-pca954x
root@SONiC-Inventec-d7054:# modprobe i2c-dev
root@SONiC-Inventec-d7054:# echo pca9548 0x71 > /sys/bus/i2c/devices/i2c-0/new_device
Check that I2C can be accessed
[Ex]
root@SONiC-Inventec-d7054:# ls /sys/bus/i2c/devices/
0-0071 i2c-0 i2c-1 i2c-2 i2c-3 i2c-4 i2c-5 i2c-6 i2c-7 i2c-8
root@SONiC-Inventec-d7054:#
root@SONiC-Inventec-d7054:# i2cget -y 3 0x20 0
0xff
root@SONiC-Inventec-d7054:# i2cget -y 6 0x20 0
0xff
Prepare an I2C stress script
[Ex]
root@SONiC-Inventec-d7054:# cat stress_i2c.sh
#!/bin/bash
# Read the same register on buses 3 and 6 in an endless loop to keep I2C busy.
while true
do
    i2cget -y 3 0x20 0 > /dev/null
    i2cget -y 6 0x20 0 > /dev/null
done
Execute the stress script in the background (several instances)
[Ex]
root@SONiC-Inventec-d7054:# sh stress_i2c.sh &
[1] 2430
root@SONiC-Inventec-d7054:# sh stress_i2c.sh &
[2] 2699
root@SONiC-Inventec-d7054:# sh stress_i2c.sh &
[3] 3614
root@SONiC-Inventec-d7054:# sh stress_i2c.sh &
[4] 4400
root@SONiC-Inventec-d7054:# sh stress_i2c.sh &
[5] 5323
root@SONiC-Inventec-d7054:#
Execute fast-reboot
[Ex]
root@SONiC-Inventec-d7054:# fast-reboot
After fast-reboot, probe I2C modules and set up the I2C topology manually.
=> Same as the manual probe step above
Observe the issue
[Ex]
root@SONiC-Inventec-d7054:# i2cget -y 3 0x20 0
[ 125.419639] i801_smbus 0000:00:1f.3: SMBus is busy, can't use it!
Error: Read failed
root@SONiC-Inventec-d7054:# i2cget -y 6 0x20 0
[ 131.274139] i801_smbus 0000:00:1f.3: SMBus is busy, can't use it!
Error: Read failed
root@SONiC-Inventec-d7054:#
Note:
=> The issue occurs when something is using I2C during fast-reboot
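Because the failure is intermittent, it can help to check the kernel log for the i801 message after each fast-reboot attempt. The following is a minimal sketch; the function name `smbus_busy_count` is hypothetical, and the message text is taken from the log lines shown above.

```shell
#!/bin/bash
# Hypothetical helper: count "SMBus is busy" lines in kernel log text
# read from standard input.
smbus_busy_count() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's status at success.
    grep -c "SMBus is busy, can't use it" || true
}
```

A typical invocation would be `dmesg | smbus_busy_count`; a non-zero count after fast-reboot suggests the condition described in this issue was hit.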
Describe the results you received
I2C can't be accessed after executing fast-reboot
root@SONiC-Inventec-d7054:# i2cget -y 3 0x20 0
[ 125.419639] i801_smbus 0000:00:1f.3: SMBus is busy, can't use it!
Error: Read failed
root@SONiC-Inventec-d7054:# i2cget -y 6 0x20 0
[ 131.274139] i801_smbus 0000:00:1f.3: SMBus is busy, can't use it!
Error: Read failed
root@SONiC-Inventec-d7054:#
Describe the results you expected
I2C should be accessible
root@SONiC-Inventec-d7054:# i2cget -y 3 0x20 0
0xff
root@SONiC-Inventec-d7054:# i2cget -y 6 0x20 0
0xff
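The expected result can also be checked in one pass over several buses. This is a sketch under assumptions: `check_i2c_access` and the overridable `I2CGET` variable are hypothetical names introduced here, and the only real tool invoked is `i2cget -y BUS ADDR REG` from i2c-tools, as used in the transcript above.

```shell
#!/bin/bash
# I2CGET can be overridden for a dry run; it defaults to i2cget from i2c-tools.
I2CGET="${I2CGET:-i2cget}"

# Hypothetical helper: read one register at the same chip address on each
# given bus and report whether the read succeeded.
# Usage: check_i2c_access ADDR REG BUS...
check_i2c_access() {
    local addr="$1"
    local reg="$2"
    local bus
    local failed=0
    shift 2
    for bus in "$@"; do
        if "$I2CGET" -y "$bus" "$addr" "$reg" >/dev/null 2>&1; then
            echo "bus $bus: ok"
        else
            echo "bus $bus: read failed"
            failed=1
        fi
    done
    # Non-zero status if any bus could not be read.
    return "$failed"
}
```

For the setup in this issue the call would be `check_i2c_access 0x20 0 3 6`, which should report both buses as readable when I2C is healthy.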
Additional information you deem important (e.g. issue happens only occasionally)
(1) This issue happens only occasionally,
because it requires that something is using I2C during fast-reboot.
(2) The issue is resolved if you invoke "/etc/init.d/platform-module-xxxxx stop" before the final step (executing the reboot) of fast-reboot. Invoking /etc/rc6.d/Kxxxx works too.
Output of
show version
root@SONiC-Inventec-d7054:~# show version
SONiC Software Version: SONiC.HEAD.603-a917517
Distribution: Debian 8.10
Kernel: 3.16.0-5-amd64
Build commit: a917517
Build date: Sun May 27 07:05:24 UTC 2018
Built by: johnar@jenkins-worker-4
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-brcm HEAD.603-a917517 6ea2d437d2af 331.8 MB
docker-syncd-brcm latest 6ea2d437d2af 331.8 MB
docker-orchagent-brcm HEAD.603-a917517 bf49606a6932 252.6 MB
docker-orchagent-brcm latest bf49606a6932 252.6 MB
docker-lldp-sv2 HEAD.603-a917517 b1e74bb1fae3 265.8 MB
docker-lldp-sv2 latest b1e74bb1fae3 265.8 MB
docker-dhcp-relay HEAD.603-a917517 084ac6122760 249.1 MB
docker-dhcp-relay latest 084ac6122760 249.1 MB
docker-database HEAD.603-a917517 1891a9d3a27d 247.8 MB
docker-database latest 1891a9d3a27d 247.8 MB
docker-teamd HEAD.603-a917517 2f5e9cfa61cb 252.3 MB
docker-teamd latest 2f5e9cfa61cb 252.3 MB
docker-snmp-sv2 HEAD.603-a917517 9f02dc564068 286.7 MB
docker-snmp-sv2 latest 9f02dc564068 286.7 MB
docker-router-advertiser HEAD.603-a917517 ea28efd33902 245.4 MB
docker-router-advertiser latest ea28efd33902 245.4 MB
docker-platform-monitor HEAD.603-a917517 b02bc911d236 276.7 MB
docker-platform-monitor latest b02bc911d236 276.7 MB
docker-fpm-quagga HEAD.603-a917517 f543a3f6da39 259.1 MB
docker-fpm-quagga latest f543a3f6da39 259.1 MB
root@SONiC-Inventec-d7054:~#
PS:
This log was dumped after executing the reproduction procedure.
Because the switch vendor's init script was commented out,
some services may not behave normally.
sonic_dump_SONiC-Inventec-d7054_20180613_131514.tar.gz