Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receive "scan_evt timeout" on serial monitor #5860

Closed
sami1369 opened this issue Nov 8, 2021 · 22 comments
Closed

Receive "scan_evt timeout" on serial monitor #5860

sami1369 opened this issue Nov 8, 2021 · 22 comments
Labels
Status: Test needed Issue needs testing

Comments

@sami1369
Copy link

sami1369 commented Nov 8, 2021

Hi guys
I use an ESP32 as scanner and 2 ESP32 Advertised to Scanner
َAfter A few hours I receive a message on serial monitor in scanner device and scanner doesn't work while other peripherals work perfect.
message is "scan_evt timeout"
my code in scanner:

Sketch:

void Task1code(void *pvParameters)
{
  for (;;)
  {
    BLEScanResults foundDevices = pBLEScan->start(scanTime, true);
    allowScan = true;
    Serial.println("Scan done!");
    pBLEScan->clearResults(); // delete results fromBLEScan buffer to release memory

    vTaskDelay(5);
  }
}

scan time is 3 second.

can you help me?

@juanaviladev
Copy link

Hi! +1, same here, has anyone found any solution to this? I've tried v2.0.2 and can still reproduce it.

Thanks in advance!

@VojtechBartoska VojtechBartoska added the Status: Awaiting triage Issue is waiting for triage label Jan 4, 2022
@beats0126
Copy link

same meet this issues after updating to esp32-arduino v2.0.2

@juanaviladev
Copy link

juanaviladev commented Jan 7, 2022

same meet this issues after updating to esp32-arduino v2.0.2

Tried using ESP-IDF v4.3.1 sdk (without the arduino layer) and still able to reproduce it, looks like an ESP-IDF problem with VHCI. espressif/esp-idf#4001

Anyway, has anyone found any way to handle the error? It's logged in the serial output, but I haven't found any callbacks or events that can be used to restart if "scan_evt timeout" occurs.

@stevenli2020
Copy link

i had the same "scan_evt timeout" issue, been scratching my head for the past 2 weeks. did anyone managed to find a fix?

@stevenli2020
Copy link

currently i am testing adding a delay of 10ms before "pBLEScan->clearResults();" - suspect clearResults happens too fast while callback function is still retrieving data. not sure if this could help...
the test has been going on for 2 days now, so far still running ok

@VojtechBartoska VojtechBartoska added the Status: Test needed Issue needs testing label Mar 31, 2022
@stevenli2020
Copy link

stevenli2020 commented Apr 1, 2022

i have bad news and good news.
bad news - the test failed after continuous scanning for about 60 hours, with the same Scan_evt Timeout error and everything fell into a deadlock. so adding the delay did not solve the problem.
however, good news is - i have successfully force the board to reboot when this error happens, by using Task Watchdog Timer (TWDT) .
here are the code fragments -

#include <esp_task_wdt.h>
..
int _T;
int scanTime = 2; //In seconds
int scanInterval = 10; //In mili-seconds
void setup() {
...
  esp_task_wdt_init(3, true); //enable panic so ESP32 restarts, interrupt when task executed for more than 3 secons
  esp_task_wdt_add(NULL); //add current thread to WDT watch 
  _T = millis(); 
...
}
void loop() {
  for(int i=0; i<1000; i++){
    Serial.printf("<%d>\n",i); 
    delay(scanInterval);
    if(!scanBLE()){
      initBLEScan();
      esp_task_wdt_reset(); // reset the watchdog timer
    }
    checkSerialCmd();
    checkWifi();
    digitalWrite(LED_BUILTIN, LOW);   
  }
  initBLEScan();
}

In the above codes, the scanning time is set to 2 seconds and scan interval 10ms, the tested time between each scan is around 2040ms, therefore the TWDT is set to 3 seconds - when scan_evt timeout occures, if hangs for more than 3 seconds, the TWDT will reboot the device.
After running this codes over night, it worked!
below are the logs i have captured when the scan_evt timeout happens -

...

<626> Publishing > 0d:b297:0d27:0072
Interval=2064, Scan done! Devices found: 19
<627> Interval=2038, Scan done! Devices found: 24
<628> Interval=2041, Scan done! Devices found: 19
<629> Publishing > 0d:b298:0d23:0074
Interval=2040, Scan done! Devices found: 17
<630> Interval=2040, Scan done! Devices found: 21
<631> Interval=2039, Scan done! Devices found: 21
<632> Publishing > 0d:b299:0d2a:0071
Interval=2071, Scan done! Devices found: 18
scan_evt timeout
<633> E (19766618) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (19766618) task_wdt:  - loopTask (CPU 1)
E (19766618) task_wdt: Tasks currently running:
E (19766618) task_wdt: CPU 0: IDLE
E (19766618) task_wdt: CPU 1: IDLE
E (19766618) task_wdt: Aborting.

abort() was called at PC 0x400e33cd on core 0


Backtrace:0x400834dd:0x3ffbe9ac |<-CORRUPTED




ELF file SHA256: 0000000000000000

Rebooting...
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x1b (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1184
load:0x40078000,len:12804
ho 0 tail 12 room 4
load:0x40080400,len:3032
entry 0x400805e4

New bootup [1097BD1A8454]

MQTT - waiting for connection.


Connecting to WiFi .....
MQTT - waiting for connection.
192.168.3.14
Scan initializing ...
<0> Interval=2018, Scan done! Devices found: 17
<1> Publishing > 0d:b29b:0d21:0072
Interval=2065, Scan done! Devices found: 22
<2> Interval=2049, Scan done! Devices found: 20

...

the entire recovery took about 5s before the ESP32 is up and running again, this walk around is not perfect solution but works for my project.
hope it helps!

@VojtechBartoska
Copy link
Contributor

Hello@stevenli2020, thanks for giving it a shot. Did you run it on 2.0.3-RC1?

@stevenli2020
Copy link

Hello@stevenli2020, thanks for giving it a shot. Did you run it on 2.0.3-RC1?

I have not test it on 2.0.3-RC1, my boards are FireBeetle ESP32. however i believe the TWDT will still be supported on the 2.0.3-RC1, isnt it?

@Whoswatching89
Copy link

Still running into this issue tried your fix @stevenli2020 getting error 'scanBLE' was not declared in this scope.

@thorrak
Copy link

thorrak commented Mar 7, 2023

I’ve been receiving this issue as well - though via h2zero’s amazing NimBLE port for Arduino. The Bluetooth stack stops responding every few hours (though in some cases within 15 minutes, and in others lasting 8 hours+). This is with the latest Arduino framework, built upon esp-idf v4.4.4.

1technophile added a commit to 1technophile/OpenMQTTGateway that referenced this issue Apr 6, 2023
Implement a BLE scan watchdog to bypass the problem related to the BLE scan hang scan_evt timeout
This is a bypass solution to espressif/arduino-esp32#5860
The watchdog will restart the ESP if no new BLE messages has been added to the queue following:

checked every 120s
if we are after the last BLE message time + the BLE scan interval for passive
if the process is not locked by an OTA update or other operation
We restart the ESP
We also remove the WDT0 enable and disable functions.
@italocjs
Copy link

italocjs commented Sep 8, 2023

Hello, i'm suffering from the same issue, i have set a few debug messages but didnt manage to find exactly when it happens. Sometimes it crashes when bt is connecting, some times when its already connected. Did anyone found a fix? (i've implemented the watchdog and it does help by rebooting the board)

scan_evt timeout
FEX 0 8
FEX 0 8
FEX 0 1
FEX 0 1
FEX 0 8
FEX 0 8
FEX 0 1
FEX 0 8
FEX 0 8
FEX 0 1
[102472][I][BluetoothSerial.cpp:296] esp_spp_cb(): ESP_SPP_CLOSE_EVT status:0 handle:129 close_by_remote:1 attempt 0
[102473][I][bluetooth.h:91] esp_spp_cb_custom_callback(): [core xvm protocol] ESP_SPP_CLOSE_EVT status:0 handle:129 close_by_remote:1 attempt 0
FEX 0 8
FEX 0 8
FEX 0 8
FEX 1 8
ets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13192
load:0x40080400,len:3028
entry 0x400805e4

@dpnebert
Copy link
Contributor

dpnebert commented Oct 6, 2023

I have a work project, so I can't share the code, but I will say that after I went in and removed all my Serial.println statements (because I was done developing/testing that part of the code), I would get the timeout pretty quickly. I had to revert back to the code that had all the print statements, and I haven't seen it since.

I will say that I have a task running the BLEScan, and when the results are ready, I suspend the task. The application logic has a conditional to check if the task is suspended:

if (eTaskGetState(scan_handle) == eSuspended) { ... }

And after I process the scan results, I call resume on that task, but not before clearing the results:

advertisedDevices.clear();
vTaskResume(scan_handle);

So it isn't like the results get cleared before being processes. Maybe starting the scan to often doesn't give the internals time?

EDIT: I figured it was worth mentioning 'pBLEScan' doesn't have a 'clear' method, it has 'clearResults'. But I pass in a vector to the task:

std::vector<BLEAdvertisedDevice*>* advertisedDevices = static_cast<std::vector<BLEAdvertisedDevice*>*>(parameters);

@redakker
Copy link

Is there any update here? I also experienced the issue.

@dpnebert
Copy link
Contributor

dpnebert commented Nov 20, 2023

Is there any update here? I also experienced the issue.

Only thing I can figure so far is application code is not yielding to other processes/tasks and the scan is timing out. Try adding short delay calls to other tasks to allow it to yield, increasing the priority of the scan task, or even add short delays around:

void scan_task(void * parameters) {
  
  std::vector<BLEAdvertisedDevice*>* advertisedDevices = static_cast<std::vector<BLEAdvertisedDevice*>*>(parameters);
  
  int scanTime = 3; //In seconds
  BLEScan* pBLEScan;

  pBLEScan = BLEDevice::getScan(); //create new scan
  pBLEScan->setAdvertisedDeviceCallbacks(new AdvertisedDeviceCallbacks(advertisedDevices));
  pBLEScan->setActiveScan(true); //active scan uses more power, but get results faster
  pBLEScan->setInterval(100);
  pBLEScan->setWindow(99);  // less or equal setInterval value
  
  for( ;; ) {
    pBLEScan->start(scanTime, false);
    Serial.println("Scan done!");
    
    vTaskSuspend(NULL);
    vTaskDelay(20 / portTICK_PERIOD_MS ); 
    pBLEScan->clearResults();   // delete results fromBLEScan buffer to release memory
    vTaskDelay(20 / portTICK_PERIOD_MS ); 
  }
}

Not sure if it will help, but I can't remember exactly how I got that to work.

@redakker
Copy link

@dpnebert
Thanks for the answer.
I've tried the following:

#define BT_DEFAULT_SCAN_DURATION_IN_SECONDS 10
void loop () {
        if (millis() - lastRun > (BT_DEFAULT_SCAN_DURATION_IN_SECONDS * 1000) + 2000) {
            pBLEScan->clearResults();   // delete results fromBLEScan buffer to release memory
            
            BLEScanResults foundDevices = pBLEScan->start(BT_DEFAULT_SCAN_DURATION_IN_SECONDS, false);                        
            lastRun = millis();
        }
}

You can see the whole code here if you want:
https://github.com/redakker/blecker/blob/main/src/bluetooth.cpp

So I "manually" interrupt the scan and give 2 seconds for the scanner. It is useful for other services too, because otherwise BLE scanner occupies the processor and for example web serving will be very slow.
I clear the result right before the next scan, I can be sure it will be a delay right after the scan period.
It is under testing now, after couple of hours there is no issue.
Let's see...

@dpnebert
Copy link
Contributor

dpnebert commented Nov 20, 2023

I know it might seem like over kill, but here is an example of how I use freeRTOS tasks to allow for adding delays into your code that don't block other parts of your code.

#include <BLEDevice.h>
#include <BLEUtils.h>
#include <BLEScan.h>

#define SCAN_DELAY_MS       30000

TaskHandle_t scan_handle;

void setup() {
  // I start a freeRTOS task so I can start,
  // stop, even delay for x amount of time.
  xTaskCreatePinnedToCore(scan_task,        // Method name
                          "BLE Scan Task",  // Only for humans for debug
                          1024*2,           // How many bytes should be alloted.
                          NULL,             // Pass in variable reference here (or NULL)
                          8,                // Priority of task
                          &scan_handle,      // Reference to Task handle.  Ex: to delete the scan task, it would look like: "vTaskDelete(scan_handle);"
                          1);               // Which ESP32 core to run this task on.
  // (https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/freertos_additions.html#_CPPv423xTaskCreatePinnedToCore14TaskFunction_tPCKcK8uint32_tPCv11UBaseType_tPC12TaskHandle_tK10BaseType_t)

}

void loop() {
  // I don't normally use the 'loop'.
  // Read up on freeRTOS for ESP32
  // to learn how the Arduino 'loop' handled.
  // So I normally start a "application logic"
  // task in the setup and delete the 'loop'
  vTaskDelete(NULL);
  // (https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/freertos_idf.html?highlight=vtaskdelay#_CPPv411vTaskDelete12TaskHandle_t)
}

void scan_task(void * parameters) {

  int scanTime = 3; //In seconds
  BLEScan* pBLEScan;

  pBLEScan = BLEDevice::getScan(); //create new scan
  pBLEScan->setActiveScan(true); //active scan uses more power, but get results faster
  pBLEScan->setInterval(100);
  pBLEScan->setWindow(99);  // less or equal setInterval value
  
  for( ;; ) {
    pBLEScan->start(scanTime, false);
    Serial.println("Scan done!");

    // Tell the task how long to delay for:
    vTaskDelay(SCAN_DELAY_MS / portTICK_PERIOD_MS ); // <- Or use "delay()"
    // (https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/freertos_idf.html?highlight=vtaskdelay#_CPPv410vTaskDelayK10TickType_t)
    
    pBLEScan->clearResults();   // delete results fromBLEScan buffer to release memory
    vTaskDelay(20 / portTICK_PERIOD_MS ); 
  }
}

Essentially, the code starts a task that has one job: ble scan. We can add a delay statement in that task to wait X amount of time before starting up again. Two reasons why this is helpful. One, it keeps us from having to keep checking all the time, freeing up those cycles to do other stuff. Two, we can run other tasks while the scan task is delayed.

@redakker
Copy link

@dpnebert
Cool, the first sketch is working on my side.
I'll change my code and test again.
Many thanks!

@Darmandran
Copy link

Darmandran commented Feb 20, 2024

my older arduino framework didn't have any issues with the scan for ESP32, but after updating the framework this year to latest due to change of PC, i stated to having issues. so i revert back to older framework again last week and so far its been okay.

my initial framework espressif32@3.5.0+sha.5fdb90c, i cant be sure since i changed PC and not sure what
but currently i am using platform = espressif32@3.5.0

but i really do hope to find a solution to NimBle library for Scan to be used on newer frameworks.

@redakker
Copy link

Since I've changed the active scan to false, this is not an issue anymore

@thorrak
Copy link

thorrak commented Feb 29, 2024

Since I've changed the active scan to false, this is not an issue anymore

This is exactly the problem though -- it breaks when active scanning, and is OK when not active scanning. There are a number of devices that I want to integrate with which require active scanning to work.

@Darmandran
Copy link

I see,
so far been a week platform = espressif32@3.5.0 with active scan = true been doing well
i will give a test with. platform = espressif32@6.4.0 with active scan = false,

will concurrently run both test and see.
thank you for the pointers.

sh1970 pushed a commit to sh1970/OpenMQTTGateway that referenced this issue Aug 23, 2024
Implement a BLE scan watchdog to bypass the problem related to the BLE scan hang scan_evt timeout
This is a bypass solution to espressif/arduino-esp32#5860
The watchdog will restart the ESP if no new BLE messages has been added to the queue following:

checked every 120s
if we are after the last BLE message time + the BLE scan interval for passive
if the process is not locked by an OTA update or other operation
We restart the ESP
We also remove the WDT0 enable and disable functions.


Former-commit-id: 145cd2b
@Parsaabasi Parsaabasi removed the Status: Awaiting triage Issue is waiting for triage label Jan 16, 2025
@Parsaabasi
Copy link

Hello,

Due to the overwhelming volume of issues currently being addressed, we have decided to close the previously received tickets. If you still require assistance or if the issue persists, please don't hesitate to reopen the ticket.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Test needed Issue needs testing
Projects
Development

No branches or pull requests