[ESP32S3] v4.4 WIFI losing connectivity temporarily or permanently without apparent reason (IDFGH-12977) #13212
Comments
@Espressif-liuuuu @zhangyanjiaoesp @nishanth-radja Do you have any findings from the log above? |
@KonssnoK Are you using the |
@zhangyanjiaoesp yes, the base is the ip_internal_network example. Also, how would you get the packets? Wireshark connected to a sniffer? |
So @zhangyanjiaoesp i was able to generate one strange behavior, even if it's not exactly the one reported in this issue. With the same code (v4.4, on top of c0e0af0) you can apply patches 1, 2 and 3, which enable monitoring and pinging: 03_ip_internal.patch. By randomly detaching/attaching the layer 2 device i was able to reach this state, where the L2 device is never able to communicate with L1. I got an extract of L1 too (i would say look at MESH_EVENT_CHILD_CONNECTED to track L2 events). |
interestingly enough to recover the L2 device i had to reboot both devices, meaning rebooting only the L2 device was not solving the issue, and even rebooting the L1 device while L2 device was stuck (after reboot) did not solve the issue |
@zhangyanjiaoesp timeout_reset1.txt To recover the devices i had to keep them offline long enough for the phone (Pixel 8) to lose its cache of connected devices. |
@KonssnoK please provide your sdkconfig file, and you are using PSRAM, right? |
sdkconfig.txt |
This log shows the device didn't get an IP address, which causes the ping timeout.
And this log shows the device can't connect to the router; the reason is an auth timeout. I have tested using the router and can't reproduce this issue. I will test again using a mobile hotspot. Can you provide the model of your phone? Or can any phone reproduce this issue? @KonssnoK |
@zhangyanjiaoesp i reproduced with a Google Pixel8. not getting the IP - strange, would mean the IP service is stuck 🤔 |
@zhangyanjiaoesp i moved to 3 devices and am trying to replicate, but for now without success.. |
and as soon as i wrote that, something strange happened again: dev2.txt
(devices are 30cm apart from each other) it seems it goes on forever |
one hour in: |
|
this is instead the log of device1 getting stuck and not trying to connect to the mesh anymore |
@KonssnoK In the log, I see that initially communication among the three devices was normal, and then you restarted device2 and device3? I have also tested with a Google Pixel 5 phone, but I couldn't reproduce the problem. |
@zhangyanjiaoesp |
Ok, I will try to restart the device and test again. |
@zhangyanjiaoesp in my experience slow data rates help trigger the issues. |
please note that the logs are more or less synchronized at the end, not the start! (i extract them at more or less the same time)
240619dev3.txt
@zhangyanjiaoesp i rebooted the root device and it went offline without managing to reconnect. After a while device 3 managed to change status and connect directly as the root.
240619dev3_2.txt
device 2 at some point manages to recover too.
240619dev3_3.txt
device 1 is still disconnected and not able to recover.
Setup: phone connected over 5G with a rate limiter at 128 kbps. device 1 does not recover. |
@zhangyanjiaoesp for reference this setup seems to trigger the issue in the above message quite often. |
@zhangyanjiaoesp i tried also today to replicate, to verify if this is consistent:
it's quite easy to create issues in this configuration, please let me know if you manage. 240620dev3.txt after a while dev 3 recovers and then also dev 2. dev 1 is stuck. |
@KonssnoK I'm sorry, I have an urgent task recently. I will test your issue next week. |
@zhangyanjiaoesp sure, i'll concentrate on another issue meanwhile |
@KonssnoK I have reproduced this issue by rebooting the root device, and I have found the root cause. The following wifi libs can solve the problem. Please replace the wifi libs and test again. For the other issues, I still can't reproduce them although I randomly reboot device2/3. |
@zhangyanjiaoesp i think i found the underlying issue. the function in the example
works only if the same interface is used over and over. But instead
will generate a new interface each time the device has to start as root. This is clearly visible from the logs above, where you see I leave you to fix this in your example. How is the propagation of the mesh fixes to 4.4 going? When can we expect a push to 4.4? Thanks. |
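To illustrate the kind of failure described above, here is a minimal, self-contained C sketch that runs outside ESP-IDF. It is an assumption about one plausible mechanism, not the confirmed root cause or the example's actual code: every name in it (`mock_netif_t`, `mock_netif_create`, `start_dhcps_fragile`, `start_dhcps_per_netif`, `root_restart_leaves_dhcps_running`) is hypothetical and is not an ESP-IDF API. It shows how a "start once" guard that is not tied to the interface object silently skips startup when a fresh interface is created on each root start.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a network interface; NOT an ESP-IDF type. */
typedef struct {
    int  id;
    bool dhcps_running;
} mock_netif_t;

static int s_next_id = 1;

/* Creates a brand-new interface, as would happen on every "start as root". */
mock_netif_t *mock_netif_create(void)
{
    mock_netif_t *n = calloc(1, sizeof(*n));
    if (n != NULL) {
        n->id = s_next_id++;
    }
    return n;
}

void mock_netif_destroy(mock_netif_t *n)
{
    free(n);
}

/* Fragile variant: "already started" is tracked globally, so it is only
 * correct if the very same interface object is reused forever. */
static bool s_dhcps_started = false;

void start_dhcps_fragile(mock_netif_t *n)
{
    if (s_dhcps_started) {
        return; /* skipped, even though n may be a new interface */
    }
    n->dhcps_running = true;
    s_dhcps_started = true;
}

/* Per-interface variant: the state lives in the interface object itself. */
void start_dhcps_per_netif(mock_netif_t *n)
{
    if (!n->dhcps_running) {
        n->dhcps_running = true;
    }
}

/* Simulates two consecutive root starts, each with a fresh interface, and
 * reports whether the DHCP server is running on the second interface. */
bool root_restart_leaves_dhcps_running(void (*start_fn)(mock_netif_t *))
{
    mock_netif_t *first = mock_netif_create();
    if (first == NULL) {
        return false;
    }
    start_fn(first);
    mock_netif_destroy(first);

    mock_netif_t *second = mock_netif_create();
    if (second == NULL) {
        return false;
    }
    start_fn(second);
    bool running = second->dhcps_running;
    mock_netif_destroy(second);
    return running;
}
```

With the fragile variant the second interface never gets its server started, which would match a device that connects but never hands out IP addresses after becoming root again. The per-interface variant starts it on every new interface.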
@KonssnoK Can you provide the patch of this log ? I need to know what are the code corresponding to logs. |
@zhangyanjiaoesp see my last message, i don't think those logs are necessary anymore, i'm testing out a small change that should fix the DHCPS issue
|
@KonssnoK The fix LGTM.
For this issue, my colleague has contacted your colleague Lorenzo, and communicated with him about the solution of providing fixes. You can confirm with him. |
@zhangyanjiaoesp the small fixes i put above are not complete
should also contain the same
|
@zhangyanjiaoesp mmm, for now we got a shadow repository of v4.4.8 esp-idf on gitlab, but the problem is that the fix you did is on esp_wifi submodule, so the current repo is not enough. I already wrote to Caijin. |
@KonssnoK we will push the fixes based on the v4.4.8 esp-idf, and then update it to you. The branch will be provided tomorrow. |
hello @zhangyanjiaoesp ,
the way i triggered this is simply to let devices start without wifi, and wait for the LTE device to connect to the LTE. how was this issue triggered in the code that you fixed? |
@KonssnoK By the way, since this ticket has been closed and there are too many comments under it, can you create a new ticket to trace this new issue? |
@zhangyanjiaoesp i'm now transitioning to the lib pushed on gitlab, i will retry to trigger the issues and in case open the new ticket.. it seems quite difficult to reproduce. |
@zhangyanjiaoesp was the fix for the deauth router pushed in our 4.4.8 branch? thanks! |
Yes, the branch contains 3 changes:
|
…rt multiple times Closes #13212
Answers checklist.
General issue report
@zhangyanjiaoesp
based on 27ec26d
To reproduce
Connect to a mobile hotspot wifi
Set the phone's SIM card to 2G.
Wait
The issue happens in the ROOT. Once the root is affected, the issue is propagated to all children.
The issue is either temporary or permanent. When permanent, the only way to recover is to reboot.
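Since a permanent failure is only recoverable by rebooting, an application-level workaround could tell the two cases apart by counting consecutive connectivity check failures. The sketch below is a plain C state tracker under stated assumptions: the type and function names (`link_watchdog_t`, `link_watchdog_init`, `link_watchdog_report`) and the threshold values are hypothetical, and it is not taken from the patches attached to this issue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the link, based only on ping results. */
typedef enum {
    LINK_OK,
    LINK_DEGRADED,      /* temporary loss: keep waiting */
    LINK_REBOOT_NEEDED  /* treated as permanent: caller should reboot */
} link_state_t;

typedef struct {
    uint32_t consecutive_failures;
    uint32_t degraded_threshold;
    uint32_t reboot_threshold;
} link_watchdog_t;

/* Thresholds are counts of consecutive failed checks (assumed values,
 * to be tuned by the caller). */
void link_watchdog_init(link_watchdog_t *wd, uint32_t degraded, uint32_t reboot)
{
    wd->consecutive_failures = 0;
    wd->degraded_threshold = degraded;
    wd->reboot_threshold = reboot;
}

/* Feed one ping result; returns the resulting link classification. */
link_state_t link_watchdog_report(link_watchdog_t *wd, bool ping_ok)
{
    if (ping_ok) {
        wd->consecutive_failures = 0; /* any success clears the streak */
        return LINK_OK;
    }
    if (wd->consecutive_failures < UINT32_MAX) {
        wd->consecutive_failures++;
    }
    if (wd->consecutive_failures >= wd->reboot_threshold) {
        return LINK_REBOOT_NEEDED;
    }
    if (wd->consecutive_failures >= wd->degraded_threshold) {
        return LINK_DEGRADED;
    }
    return LINK_OK;
}
```

On a real device the caller would feed this from its own ping task and, on `LINK_REBOOT_NEEDED`, restart the chip (for example with ESP-IDF's `esp_restart()`). This only works around the symptom on the root and does not address the underlying cause.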
espressif_wifi_dump_3.txt
espressif_wifi_dump.txt
espressif_wifi_dump_2.txt
Other related issues with similar behavior:
#8953
#10506