Which kernel process crashed or hangs? #1777
Comments
Nothing stands out from dmesg log other than the CPU stalls. |
I was editing files; there was no load on the system. This is the first time this kind of crash has occurred. top output after noticing the hang:
ps ax output after noticing the hang:
df -h
Firmware is from the latest raspbian release:
Kernel is still an older one, with 1000Hz timer frequency and dynamic ticks disabled (the latest 4.4.32 has issues with scheduling or whatever; it causes packet loss on the wifi cards)
CPU underclocked, rest overclocked, but not too much:
A wifi stick (that wasn't connected to anything and no network-manager or whatever running that does stuff with it), HDMI monitor and ethernet cable. Powered via official Pi power supply. vcgencmd get_throttled never returns anything other than 0x0. |
Can you remove the overclock and disconnect the wifi stick? Confirm whether the issue still occurs. |
You really need to confirm the problem occurs with a kernel we have provided. If you click on each commit the end of the url contains a git hash. Run |
P33M told me it would help with USB flakiness, see this issue: #1763
Oh, didn't notice that, thanks. I'm using /dev/serial0 to exchange data with an external system (telemetry from flight control) and it works. I thought bluetooth would be there without that overlay? (Maybe I'm confusing something; if I remember correctly this was changed several times because of some issues.)
This would take months to clearly reproduce and attribute to a certain kernel. While doing that, I would find other bugs, maybe unrelated, maybe not, which would consume even more time to analyze. By then, you will have bumped the kernel and firmware another 10-20 times, and when I open an issue then, you'll probably tell me to test with the latest stock kernel (understandable, of course). And if it doesn't happen then, try to find exactly the one where the problem started. Hmm, back to step one. Assuming that the problem is then fixed in that future kernel (maybe it already is in 4.4.32 ...): Great, I'll start using it, and after some time (during which another 10 kernels will be released) I'll find another issue (just like I just did ...). Report it here, and the story begins again. See the issue with that approach? I thought having a kernel bug occurring, having dmesg and other logs, plus having the luck that the system was still usable for further debugging, would actually be a great chance to at least narrow it down some. |
Sometimes the log gives a clear indication of where the problem is. Interestingly when searching for "rcu_sched self-detected stall on CPU" and raspberry pi I found #1253 which turned out to be overclock related, so confirming the issue still occurs without overclock seems like a good idea. |
??? "rcu_sched self-detected stall on CPU" simply means that a stall has been detected. There can be a lot of things that cause this message to appear. The guy in issue 1253 had the RAM overclocked to 500MHz and the dmesg output says "Unable to handle kernel paging request at virtual address fffffd48". Well, whatever, the problem occurred again, this time not overclocked, and I have the start of the dmesg output. There was again that 25% si load seen in top. As probably nobody is going to look into this anyway, issue closed. Maybe the logs below help somebody else trying to figure out what's going on.
|
Just happened again. It's always "swapper". top again shows 25% si:
|
This time it crashed right while plugging in a USB keyboard:
|
Oh, another one. This time I was using a USB joystick:
|
Another one:
|
Crash galore continued:
|
This thing is so broken. Plugged in a USB keyboard, now USB is not functional anymore:
|
This time, I just plugged in a USB memory stick. First time it's crapping out:
Second time it works:
|
What does |
I didn't check that particular time, but whenever I check, it's as always: "throttled=0x0". It's not a power or temp issue. I have double and triple checked that. Tried the official Pi power supply, and also a 7.5A 5.2V supply plus an extra low-ESR cap wired directly to the GPIO pins. BTW: I have had a script running that did the vcgencmd get_throttled check periodically, every 30 seconds or so. This caused other issues, strange VCHIQ kernel messages etc. |
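For anyone reading a value other than 0x0, here is a minimal Python sketch that decodes the bitmask printed by vcgencmd get_throttled. The bit meanings follow the Raspberry Pi documentation as I understand it (some bits were added in later firmware than the one discussed here), and the names `THROTTLED_FLAGS` and `decode_throttled` are made up for this example. It only parses a string that was already captured, so it does not poll the firmware itself.

```python
# Bit positions as described in the Raspberry Pi documentation for
# "vcgencmd get_throttled"; treat this table as an assumption and check it
# against the documentation for your firmware version.
THROTTLED_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Turn a line such as 'throttled=0x50005' into a list of flag descriptions."""
    # Everything after the first '=' is the hexadecimal bitmask.
    value = int(output.strip().split("=", 1)[1], 16)
    return [
        label
        for bit, label in sorted(THROTTLED_FLAGS.items())
        if value & (1 << bit)
    ]
```

Calling `decode_throttled("throttled=0x0")` returns an empty list, which matches the "no power or temperature event" reading reported in this thread.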
Not sure how long this has been going on. I was editing some files, wanted to reboot, noticed the reboot command didn't react, looked into top and dmesg and saw this:
The system is still running, how can I identify this further?
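One way to narrow this down while the system is still responsive is to check which CPU is accumulating softirq time (the "si" column in top). The Python sketch below compares two snapshots of /proc/stat and reports the softirq share per CPU. The function names are made up for this example, and it assumes the standard /proc/stat column order (user, nice, system, idle, iowait, irq, softirq, steal). If the board has four cores (an assumption; the thread does not say which Pi model this is), a steady 25% si overall would be consistent with a single core spending all of its time in softirq.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: [tick counters]} for the per-CPU lines of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines start with "cpu0", "cpu1", ...; the aggregate "cpu"
        # line and all other lines (intr, ctxt, ...) are skipped.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(value) for value in fields[1:]]
    return counters


def softirq_share(before, after):
    """Percentage of elapsed ticks each CPU spent in softirq between snapshots."""
    shares = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        deltas = [n - o for n, o in zip(new, old)]
        # Sum user..steal only; guest time is already included in user/nice.
        total = sum(deltas[:8])
        # Index 6 is the softirq column.
        shares[cpu] = 100.0 * deltas[6] / total if total > 0 else 0.0
    return shares


def sample_softirq_share(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and compare."""
    with open(path) as handle:
        first = parse_proc_stat(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_proc_stat(handle.read())
    return softirq_share(first, second)
```

Running `sample_softirq_share()` on the affected machine shows whether the load sits on one core. Comparing two reads of /proc/softirqs in the same way would then show which softirq type (for example TIMER, NET_RX or TASKLET) is growing on that core.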