ping too frequently results in Safe Mode #5980

Closed
anecdata opened this issue Feb 3, 2022 · 11 comments · Fixed by #7938
Labels
bug espressif applies to multiple Espressif chips network
Comments

@anecdata
Member

anecdata commented Feb 3, 2022

CircuitPython version

Adafruit CircuitPython 7.2.0-alpha.1-224-gac7a80753 on 2022-01-26; Adafruit QT Py ESP32S2 with ESP32S2
Adafruit CircuitPython 7.2.0-alpha.1-224-gac7a80753 on 2022-01-26; Saola 1 w/Wrover with ESP32S2

Code/REPL

import time
import wifi
from secrets import secrets

DELAY = 0.5  # this is fine, but 0-0.4 or so lead to Safe Mode after a couple of pings

wifi.radio.connect(secrets['ssid'], secrets['password'])
while True:
    print(f"LAN ping: {wifi.radio.ping(wifi.radio.ipv4_gateway)}s")
    time.sleep(DELAY)

Behavior

code.py output:
LAN ping: 0.009s
LAN ping: 0.237s
LAN ping: 0.004s
LAN ping: 0.002s

[tio 10:38:58] Disconnected
[tio 10:39:00] Connected
Running in safe mode! Not running saved code.

You are in safe mode because:
CircuitPython core code crashed hard. Whoops!
Crash into the HardFault_Handler.
Please file an issue with the contents of your CIRCUITPY drive at
https://github.com/adafruit/circuitpython/issues

Press any key to enter the REPL. Use CTRL-D to reload.

Description

It's a little more resilient if the same IP address isn't pinged repeatedly:

import time
import wifi
import ipaddress
from secrets import secrets

DELAY = 0.1  # Safe Mode after about 10 pings

wifi.radio.connect(secrets['ssid'], secrets['password'])
for _ in range(0, 256):
    ipv4 = ipaddress.ip_address(".".join((repr(wifi.radio.ipv4_gateway).rpartition(".")[0], str(_))))
    print(f"LAN ping: {str(ipv4):15} {wifi.radio.ping(ipv4)} s")
    time.sleep(DELAY)

Additional information

I think this has been an issue since the beginning of wifi iirc. Not a showstopper, just don't ping too fast.
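
A minimal sketch of the "don't ping too fast" workaround, for anyone who wants it enforced in code rather than by a bare sleep. The helper name `make_throttled` and its parameters are hypothetical; the 0.5 s default is only the value that happened to be stable in the test above, not a documented limit.

```python
import time

# Assumption: 0.5 s is the delay that was stable in this report; 0-0.4 s was not.
MIN_INTERVAL = 0.5


def make_throttled(ping, min_interval=MIN_INTERVAL,
                   clock=time.monotonic, sleep=time.sleep):
    """Wrap `ping` so each call starts at least `min_interval` seconds
    after the previous call returned."""
    last_end = None

    def throttled(address):
        nonlocal last_end
        if last_end is not None:
            wait = min_interval - (clock() - last_end)
            if wait > 0:
                sleep(wait)
        try:
            return ping(address)
        finally:
            # Record the finish time even if ping raised.
            last_end = clock()

    return throttled
```

On a board this would be used as `ping = make_throttled(wifi.radio.ping)` followed by `ping(wifi.radio.ipv4_gateway)`. The `clock` and `sleep` parameters exist only so the wrapper can be exercised off-device.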

@anecdata anecdata added the bug label Feb 3, 2022
@tannewt tannewt added espressif applies to multiple Espressif chips network labels Feb 3, 2022
@tannewt tannewt added this to the Long term milestone Feb 3, 2022
@tannewt
Member

tannewt commented Feb 3, 2022

A debug trace from the IDF would be super helpful. That should show how we get into safe mode.

@DavePutz
Collaborator

DavePutz commented Feb 8, 2022

The faster rate of pings will cause the remote side to stop responding because of ICMP rate-limiting. This will cause a ping timeout, but it looks like the CP code that detects timeouts will not work: testing shows ESP_PING_PROF_DURATION only returns non-zero when a reply has been received. I thought the lack of ping callbacks might be the problem, but adding them did not fix it.

@BlueBackbite

Just as a side note to #5745, I ran the pings with 10-minute delays and the boards I've tested would still crash after about 30 pings. As if the pings were eating memory somewhere.

@anecdata
Member Author

Adafruit CircuitPython 7.2.0-alpha.2-23-gd4c2ffea2-dirty on 2022-02-14; ESP32-S2-DevKitC-1-N4R2 with ESP32S2 (should be equivalent to https://adafruit-circuit-python.s3.amazonaws.com/bin/espressif_esp32s2_devkitc_1_n4r2/en_US/adafruit-circuitpython-espressif_esp32s2_devkitc_1_n4r2-en_US-20220214-d4c2ffe.bin)

W (9514) wifi: got ip
E (11164) ping_sock: esp_ping_new_session(270): create socket failed: -1
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x40114aba  PS      : 0x00060f30  A0      : 0x800b8fec  A1      : 0x3ffdff90  
A2      : 0x00060023  A3      : 0xffffffff  A4      : 0x3ffdfff8  A5      : 0x3fff0688  
A6      : 0x00000080  A7      : 0x7ffa0000  A8      : 0x80114886  A9      : 0x3ffdff60  
A10     : 0x3fff0688  A11     : 0x3ff9f31c  A12     : 0x00000001  A13     : 0x00000001  
A14     : 0x3f031b8e  A15     : 0x3f031e7f  SAR     : 0x00000015  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00060087  LBEG    : 0x00000001  LEND    : 0x00000001  LCOUNT  : 0x4002cc51  

Backtrace:0x40114ab7:0x3ffdff900x400b8fe9:0x3ffdffc0 0x400b2d11:0x3ffe0030 0x40094746:0x3ffe0060 0x4008f9f2:0x3ffe0090 0x4008fb05:0x3ffe00b0 0x4009e0df:0x3ffe00d0 0x400947da:0x3ffe0170 0x4008f9f2:0x3ffe01a0 0x4008fa1e:0x3ffe01c0 0x400d6f1f:0x3ffe01e0 0x400d7296:0x3ffe0280 0x400a1aac:0x3ffe02a0 0x400a1e01:0x3ffe02c0 0x400a21fe:0x3ffe0330 0x400a254b:0x3ffe0360 0x401736e5:0x3ffe0380 

ELF file SHA256: 595260a3b1093040

CPU halted.

@tannewt
Member

tannewt commented Feb 15, 2022

@anecdata Please use https://github.com/adafruit/circuitpython/blob/main/ports/espressif/tools/decode_backtrace.py to decode the backtrace. Run it with the board name, `python tools/decode_backtrace.py ESP32-S2-DevKitC-1-N4R2`, and then copy and paste the whole line starting with `Backtrace:`.
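
For readers unfamiliar with the format, here is a rough sketch of the first step any such decoder needs: pulling the program-counter addresses out of a `Backtrace:` line. This is not the repository script; `backtrace_pcs` is a hypothetical name, and the sketch assumes each entry is a `PC:SP` pair of 8-digit hex values. Matching fixed-width pairs also copes with the first two entries running together without a space, as in the trace above.

```python
import re

# One "PC:SP" pair, e.g. 0x40114ab7:0x3ffdff90 (assumed 8 hex digits each).
_PAIR = re.compile(r"0x([0-9a-fA-F]{8}):0x([0-9a-fA-F]{8})")


def backtrace_pcs(line):
    """Return the program-counter values from a 'Backtrace:' line, in order."""
    if not line.startswith("Backtrace:"):
        raise ValueError("expected a line starting with 'Backtrace:'")
    return [int(pc, 16) for pc, _sp in _PAIR.findall(line)]


# First three entries of the trace posted above.
sample = "Backtrace:0x40114ab7:0x3ffdff900x400b8fe9:0x3ffdffc0 0x400b2d11:0x3ffe0030"
pcs = backtrace_pcs(sample)
```

Each address could then be handed to a symbolizer such as addr2line along with the matching firmware ELF; how the repository script does that step is not shown in this thread.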

@anecdata
Member Author

anecdata commented Feb 15, 2022

That's a cool tool. Seems to be a problem with an error return from the lwip component's ping. Why is there no common-hal line in the trace?

espressif/build-espressif_esp32s2_devkitc_1_n4r2/esp-idf/../../esp-idf/components/lwip/apps/ping/ping_sock.c:347 (discriminator 5)
espressif/../../shared-bindings/wifi/Radio.c:536
espressif/../../py/objfun.c:136
espressif/../../py/runtime.c:656
espressif/../../py/runtime.c:671
espressif/../../py/vm.c:1102
espressif/../../py/objfun.c:297 (discriminator 4)
espressif/../../py/runtime.c:656
espressif/../../py/runtime.c:629
espressif/../../shared/runtime/pyexec.c:146
espressif/../../shared/runtime/pyexec.c:743
espressif/../../main.c:223
espressif/../../main.c:375
espressif/../../main.c:876
espressif/supervisor/port.c:403
espressif/build-espressif_esp32s2_devkitc_1_n4r2/esp-idf/../../esp-idf/components/freertos/port/port_common.c:129

@tannewt
Member

tannewt commented Feb 15, 2022

Is this an opt build? The compiler could have brought the underlying call in common-hal directly into the shared bindings bit. The LoadProhibited error is usually due to an invalid memory address being passed.

@anecdata
Member Author

anecdata commented Feb 15, 2022

I don't know what an opt build is. This was built from tip of main yesterday with:
make -j clean all BOARD=espressif_esp32s2_devkitc_1_n4r2 DEBUG=1

@tannewt
Member

tannewt commented Feb 16, 2022

An opt build is one without DEBUG=1. I think the compiler actually still inlines in debug builds. The unoptimized builds don't fit in the flash space we've set aside.

@todbot

todbot commented Sep 28, 2022

Ran into this issue recently and saw it on both 7.3.3 and 8.0.0-beta.0, on both ESP32-S2 and ESP32-S3.
Specifically the versions and hardware were:

Adafruit CircuitPython 8.0.0-beta.0 on 2022-08-18; Adafruit QT Py ESP32-S3 no psram with ESP32S3
Adafruit CircuitPython 8.0.0-beta.0 on 2022-08-18; S2Mini with ESP32S2-S2FN4R2
Adafruit CircuitPython 7.3.3 on 2022-08-29; S2Mini with ESP32S2-S2FN4R2
Adafruit CircuitPython 7.3.0-rc.1 on 2022-05-18; Adafruit QT Py ESP32-S3 no psram with ESP32S3

See #6955 for more details (it's a duplicate I filed before I knew of this issue).

@anecdata
Member Author

Thanks, @todbot

I couldn't find any related open issues in the Espressif esp-idf, though there was an older closed issue regarding ping crashes.

jepler added a commit to jepler/circuitpython that referenced this issue May 5, 2023
esp_ping_new_session can fail, particularly if ping is called quickly
many times in succession.

This is because `esp_ping_new_session` has to do a bunch of stuff
including creating a task and a socket. Calling `esp_ping_delete_session`
doesn't clean up these resources immediately. Instead, it signals the
task to clean up resources and exit 'soon', but 'soon' is defined as 1
second.

When the calls are frequent, the in-use sockets and tasks fill up
available slots—I didn't actually check which resource gets used
up first.

With this change, the ping call will raise an exception instead of
continuing with a call to esp_ping_start that crashes.

Closes adafruit#5980 based on my testing on an ESP32S3-N8R2.
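
Given the behavior described in the commit message above (an exception instead of a crash, and resources released roughly a second after a session is deleted), calling code could retry after a short pause. The following is a sketch only: `ping_with_retry` and its parameters are hypothetical, and because the commit message does not say which exception type is raised, the sketch catches `Exception` broadly.

```python
import time


def ping_with_retry(ping, address, attempts=3, backoff=1.0, sleep=time.sleep):
    """Call ping(address); if it raises, wait `backoff` seconds and try again.

    The 1.0 s default mirrors the 'soon is defined as 1 second' cleanup
    delay mentioned in the commit message; it is an assumption, not a
    guaranteed bound.
    """
    for attempt in range(attempts):
        try:
            return ping(address)
        except Exception:  # assumption: exact exception type is not stated
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            sleep(backoff)
```

On a board this might be called as `ping_with_retry(wifi.radio.ping, wifi.radio.ipv4_gateway)`. Once the actual exception type is known, narrowing the `except` clause would avoid hiding unrelated errors.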