Disconnected two wifi cards, interfaces still there, still packets coming in #1763
Comments
Well, after staring at it totally bewildered for some more time (just to repeat: there are packets coming in on interfaces that have been physically disconnected more than 10 minutes ago!), suddenly, it realized that the USB sticks are gone:
The kworker thread is also gone, but now hello_video.bin for whatever strange reason uses 30% cpu (usually, it only needs around 10-15%)
What's also strange is that one of the three remaining cards has received far fewer packets (2.3GB vs. 900MB):
|
Those aren't packets - the URBs being submitted are returning failure but the driver isn't acting on the fact that the device is no longer there. I would wager that there's some bug in updating the stats counting despite the USB transfer failing. This is a case of the Ralink driver doing a bad thing. On disconnect the RX polling thread isn't getting shut down properly. |
The rx application (which fetches incoming packets from the interface via pcap) also still showed incoming packets 10 minutes later. That application counts packets only after it has successfully identified them as "destined to me", using a special fake MAC address. So I don't think this was just wrong counters. Either packets were stuck in some buffer (for that long? maybe because the load was too high), or those packets were actually coming from other cards and ended up in the data structures of the disconnected cards, which would be kinda scary. I hope it was just stuck data ... |
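For readers unfamiliar with that kind of check, here is a minimal sketch of how a receiver could count only frames addressed to one fake MAC. It is not the actual rx application code. It assumes each captured frame starts with a radiotap header (usual for pcap on a monitor-mode interface) and that the match is made on the first 802.11 address field; `FAKE_MAC`, `destined_to` and `count_matching` are hypothetical names chosen for illustration.

```python
import struct

# Hypothetical placeholder address, not the one the real application uses.
FAKE_MAC = bytes.fromhex("020000000001")


def radiotap_length(frame: bytes) -> int:
    """Return the radiotap header length stored in bytes 2-3 (little-endian)."""
    if len(frame) < 4:
        raise ValueError("frame too short for a radiotap header")
    return struct.unpack_from("<H", frame, 2)[0]


def destined_to(frame: bytes, mac: bytes) -> bool:
    """True if the first 802.11 address field (receiver address) equals mac."""
    start = radiotap_length(frame) + 4  # skip frame control (2) + duration (2)
    return frame[start:start + 6] == mac


def count_matching(frames, mac: bytes = FAKE_MAC) -> int:
    """Count frames whose receiver address matches mac, skipping malformed ones."""
    total = 0
    for frame in frames:
        try:
            if destined_to(frame, mac):
                total += 1
        except ValueError:
            continue
    return total
```

If a counter like this keeps increasing for an unplugged interface, the frames passed a content check, so the driver's statistics alone cannot explain it.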
Or an alternative explanation - the USB link is flakey in some fashion but enough data is getting through that the device only gets disconnected occasionally.
|
Yeah, USB is indeed flaky on the Pi. The issue I posted is not the only one, it was just the most bewildering one so far. To answer your questions:
|
Isolating flakiness to just the Pi plus 3-4 peripherals is optimal, because it means I can have a hope of reproducing the issue without speculating about issues arising from PSU/dodgy cables/induced EMI from living next to a steel foundry/phase of the moon.

One thing to try: In /boot/config.txt, add the parameter

As you can get "flakiness" with just Atheros wifi dongles, do you have dmesg logs from where these dongles are exhibiting "flakiness"? Where can I buy these wifi devices - preferably the exact same model number as the ones in your possession?

Edit: regarding packets appearing from phantom devices that have been removed, that's almost certainly a software bug (and potentially different from Atheros device flakiness) but won't be reproducible unless we have the same hardware exhibiting the same behaviour. |
Thanks, I'll give that usb_mdio parameter a try. The Atheros sticks are TP-Link TL-WN722N and Alfa AWUS036NHA with the AR9271 chipset; they can be found almost everywhere. The Ralink sticks are from CSL and are called "CSL 300 Mbit stick" (with the Ralink RT5572 chipset). Not sure if they deliver outside of Germany though: You can find similar sticks on Aliexpress:

Regarding dmesg logs: Not too much, unfortunately. The problem is that USB keyboard and ethernet often stop working, and I cannot see the console because the video is on a layer in front of it. The serial console is also not accessible because the serial port is used for telemetry data. I still need to find some time to set up a testbed with a serial console. I have already opened an issue with the ath9k-htc firmware project; there you can find some logs and USB packet captures. Shall I enable any kernel debug/tracing features beforehand?

Edit: Oh, in case you're wondering why so many wifi sticks :) I'm in the process of building an antenna tracker/relay station for EZ-Wifibroadcast. Three sticks are for the 6 directional antennas, one stick is for two additional omni antennas (for close-range flying and flying high above the tracker), and another one is for relaying the video from the antenna tracker to the video goggles. https://www.youtube.com/watch?v=dOKFY1A7Wxg |
@rodizio1 Can this be closed? |
Sorry for the late reply, I somehow overlooked your comment. The issues still exist. Still hoping that P33M gets some Ralink and Atheros dongles and looks into it :)
Have tried this in the meantime, cannot tell a difference. |
@P33M Was there any progress on this? |
To give an update: there is somebody who has exactly the same issues with Atheros sticks on the Pi, but not on other hardware. See this issue on the ath9k-htc-firmware GitHub page: qca/open-ath9k-htc-firmware#114

Edit: Oh, and somebody else seems to have the same problem here as well: #2023

This issue is really easy to reproduce. Get 4 Atheros AR9271 based wifi sticks and plug them in. If you're lucky, you get three running; usually you see all kinds of weird things happening after you plug in the third (it also doesn't matter if they are already connected at boot-up). This can be reproduced on all Raspberry Pi versions with all Raspbian versions from (at least) 2015-12-31 up to the latest.

Another way to make them crash is to put them into promiscuous mode; they'll crash about 1-2 out of 10 times doing that (other people reported the same).

Three Atheros sticks seem to work kind of stably now (that is, after I made sure that nothing else is accessing the USB bus while the sticks are initialized or ifconfig/iwconfig commands are being run, inserted a delay between the iwconfig commands, and stopped putting them into promiscuous mode ...)

With Ralink sticks you need at least 6 to make it behave strangely and crash. Mixing Ralink and Atheros works, but only up to two Atheros and two Ralink sticks; anything above that and it crashes and does strange things. With AR9287 sticks it's even worse, only one works reliably. |
This may be related to the failure mode in #2023 - Atheros dongles specify interrupt endpoints with a period of 1 microframe. For reasons explained in #2023, this causes starvation when servicing other endpoints. Device disconnects are only detected by the hub, which then raises a port status change flag in its status endpoint (an interrupt endpoint). If the port status is never read, the host doesn't know the device has disappeared. Please retest this with latest rpi-update and cmdline.txt parameter specified in the linked issue. |
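A quick way to check whether an interface that should be gone is still accumulating received packets is to compare two snapshots of `/proc/net/dev`. The following is only a rough sketch for retesting, not something taken from the driver or from this thread; the helper names `parse_net_dev` and `still_receiving` and the 10-second interval are arbitrary choices.

```python
import time


def parse_net_dev(text: str) -> dict:
    """Return {interface: (rx_bytes, rx_packets)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 2:
            continue
        # Receive columns come first: bytes, packets, errs, drop, ...
        counters[name.strip()] = (int(fields[0]), int(fields[1]))
    return counters


def still_receiving(before: dict, after: dict) -> list:
    """Interfaces present in both snapshots whose RX packet counter increased."""
    return sorted(
        name for name in before
        if name in after and after[name][1] > before[name][1]
    )


if __name__ == "__main__":
    with open("/proc/net/dev") as f:
        first = parse_net_dev(f.read())
    time.sleep(10)  # arbitrary interval between the two snapshots
    with open("/proc/net/dev") as f:
        second = parse_net_dev(f.read())
    print("RX counters still increasing on:", still_receiving(first, second))
```

Run it after unplugging a stick: if the unplugged interface is still listed, the host has not processed the disconnect, or the driver is still updating its counters.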
@rodizio1 can you test with the latest kernel please, and report back any issues. Failing that, this issue will be closed within 30 days unless further interactions are posted. If you wish this issue to remain open, please add a comment. A closed issue may be reopened if requested. |
I haven't seen the initial issue (i.e. packets still coming in from nirvana) again. However, the other issues (see my last post in this thread) still exist. The absolute maximum is 3 Atheros cards, or 5 Ralink cards (or 2 Atheros and 2 Ralink); otherwise things get unstable. This also happens with kernel 4.9.35. I have not yet tested 4.14.50 though. |
I have a similar issue with 2 RT5370 units and a SimCOM 7100A attached. Tested with a powered USB hub and without.
[ +0.120031] ieee80211 phy3: rt2x00usb_vendor_request: Error - Vendor Request 0x07 failed for offset 0x1328 with error -110 |
The same
|
Had 5 Ralink rt2800 usb wifi cards in monitor mode receiving packets. Disconnected two of them, now the interfaces are still there and there are still packets coming in on those interfaces?
Using the latest Raspbian 4.4.32 kernel and firmware on a Pi3.
This is what dmesg said after disconnecting the wifi sticks:
ifconfig after disconnecting the cards:
ifconfig about 10 minutes later:
dmesg about 10 minutes later (those messages continue, I assume because that kworker thread is eating 100% cpu?)