
Running custom classifier - Slow Inference (TFMIC-13) #25

Closed
william-hazem opened this issue Nov 11, 2022 · 7 comments

Comments

william-hazem commented Nov 11, 2022

Hi Everyone!

I built my own classifier and embedded it on an ESP32, but every time it runs an inference the task watchdog is triggered.

I (13060) Projeto: img 1 -> Inference 0 = car
E (18070) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (18070) task_wdt:  - IDLE (CPU 0)
E (18070) task_wdt: Tasks currently running:
E (18070) task_wdt: CPU 0: main
E (18070) task_wdt: CPU 1: IDLE
E (18070) task_wdt: Print CPU 0 (current core) backtrace


Backtrace: 0x400E33EF:0x3FFB0B90 0x400827F9:0x3FFB0BB0 0x400E17D1:0x3FFDC310 0x400DBF6A:0x3FFDC3B0 0x400DF0A6:0x3FFDC750 0x400D65F6:0x3FFDC780 0x400D56E3:0x3FFDC7A0 0x400D55D2:0x3FFDC7D0 0x400F283B:0x3FFDC7F0 0x40087D31:0x3FFDC810
0x400e33ef: task_wdt_isr at C:/Espressif/frameworks/esp-idf-v4.4-2/components/esp_system/task_wdt.c:183 (discriminator 3)

0x400827f9: _xt_lowint1 at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x400e17d1: esp_nn_conv_s8_opt at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/components/esp-nn/src/convolution/esp_nn_conv_opt.c:157 (discriminator 2)

0x400dbf6a: tflite::(anonymous namespace)::Eval(TfLiteContext*, TfLiteNode*) at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:230
 (inlined by) Eval at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/components/tflite-lib/tensorflow/lite/micro/kernels/esp_nn/conv.cc:293

0x400df0a6: tflite::MicroGraph::InvokeSubgraph(int) at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/components/tflite-lib/tensorflow/lite/micro/micro_graph.cc:172

0x400d65f6: tflite::MicroInterpreter::Invoke() at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/components/tflite-lib/tensorflow/lite/micro/micro_interpreter.cc:285

0x400d56e3: loop at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/main/main_functions.cc:165

0x400d55d2: app_main at C:/Users/willi/OneDrive/Documentos/PlatformIO/IDF/projeto/main/main.cc:21 (discriminator 1)

0x400f283b: main_task at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/port_common.c:141 (discriminator 2)

0x40087d31: vPortTaskWrapper at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/port.c:131


E (18070) task_wdt: Print CPU 1 backtrace


Backtrace: 0x40084221:0x3FFB1190 0x400827F9:0x3FFB11B0 0x4000BFED:0x3FFDD610 0x40087FE2:0x3FFDD620 0x400E364C:0x3FFDD640 0x400E3657:0x3FFDD670 0x400D20A5:0x3FFDD690 0x400866BA:0x3FFDD6B0 0x40087D31:0x3FFDD6D0
0x40084221: esp_crosscore_isr at C:/Espressif/frameworks/esp-idf-v4.4-2/components/esp_system/crosscore_int.c:92

0x400827f9: _xt_lowint1 at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x40087fe2: vPortClearInterruptMaskFromISR at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/include/freertos/portmacro.h:571
 (inlined by) vPortExitCritical at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/port.c:319

0x400e364c: esp_task_wdt_reset at C:/Espressif/frameworks/esp-idf-v4.4-2/components/esp_system/task_wdt.c:330

0x400e3657: idle_hook_cb at C:/Espressif/frameworks/esp-idf-v4.4-2/components/esp_system/task_wdt.c:80

0x400d20a5: esp_vApplicationIdleHook at C:/Espressif/frameworks/esp-idf-v4.4-2/components/esp_system/freertos_hooks.c:51 (discriminator 1)

0x400866ba: prvIdleTask at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/tasks.c:3987 (discriminator 1)

0x40087d31: vPortTaskWrapper at C:/Espressif/frameworks/esp-idf-v4.4-2/components/freertos/port/xtensa/port.c:131


I (19400) Projeto: img 2 -> Inference 0 = car

My loop function:

void loop() {

  static int x = 0;

  if(imBuffer == NULL)
  {
    imBuffer = (float*) malloc(sizeof(float)*WIDTH*HEIGHT);
    if(!imBuffer)
    {
      ESP_LOGE(TAG, "Buffer not allocated - ABORTING");
      return;
    }

    ESP_LOGI(TAG, "Buffer allocated");
  }
  // retrieve image from flash
  getImage(imBuffer, x);
  for(int i = 0; i < WIDTH*HEIGHT; i++)
  {
    input->data.f[i] = imBuffer[i];
  }

  vTaskDelay(1);
  
  TfLiteStatus statusInvoke = interpreter->Invoke();
  if(statusInvoke != TfLiteStatus::kTfLiteOk)
  {
    ESP_LOGE(TAG, "Inference error (%d)", statusInvoke);
    return;
  }
  
  float result = output->data.f[0];
  int pred = result > 0.5 ? 1 : 0;
  ESP_LOGI(TAG, "img %d -> Inference %d = %s", x, pred, labels[pred]);

  x = x < 4 ? x+1 : 0;
  vTaskDelay(10 / portTICK_PERIOD_MS);
}

Is there any way to improve the speed without triggering the watchdog?

@william-hazem william-hazem changed the title Running custom classifier Running custom classifier - Slow Inference Nov 13, 2022
vikramdattu (Collaborator) commented:

Hi @william-hazem. The code you are using is already optimized and takes advantage of esp-nn under the hood.

What is the size of the model you're using, and what inference time are you getting?
Please make sure the QIO flash option is turned on and that the CPU is clocked at 240 MHz, if it isn't already.

william-hazem (Author) commented:

Hi @vikramdattu, thanks for answering.

My model is 14,832 bytes and the tensor arena consumes around 155 kB. Currently I'm getting an inference time between 4 and 5 seconds. My model uses float data and only a few layers.

My esp is running with following:

  • CPU 240MHz
  • Flash Mode: QIO
  • Flash SPI speed 40MHz
  • ESP-NN: Optimized versions

vikramdattu (Collaborator) commented:

@william-hazem can you please change the SPI speed to 80MHz and check? Also, can you please use internal memory for Arena if that's a possibility?

Optionally, via menuconfig, you may want to set Task Watchdog timeout to 5 seconds.
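
For reference, the settings discussed in this thread map to sdkconfig options roughly like the fragment below (option names as they appear in ESP-IDF v4.4 menuconfig; double-check against your IDF version):

```
# CPU clocked at 240 MHz
CONFIG_ESP_DEFAULT_CPU_FREQ_MHZ_240=y
# QIO flash mode at 80 MHz SPI speed
CONFIG_ESPTOOLPY_FLASHMODE_QIO=y
CONFIG_ESPTOOLPY_FLASHFREQ_80M=y
# Task watchdog timeout raised to 5 seconds
CONFIG_ESP_TASK_WDT_TIMEOUT_S=5
```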

william-hazem (Author) commented:

Changing the SPI speed to 80 MHz didn't improve the speed, but I set the compiler optimization level to -O2 ("Performance") and it ran better: the inference time dropped from ~4400 ms to around 2400 ms per inference.

I'm currently using the operators below, and Conv2D is very expensive in inference time and uses a lot of memory.

static tflite::MicroMutableOpResolver<7> resolver;
resolver.AddFullyConnected();
resolver.AddQuantize();
resolver.AddDequantize();
resolver.AddAveragePool2D();
resolver.AddConv2D();
resolver.AddLogistic();
resolver.AddMean();

I think I need to make the convolutional layers lighter, and keep compiling my IDF project with the -O2 flag for performance.

vikramdattu (Collaborator) commented:

Hi @william-hazem, thanks for sharing the results of the optimization-flag experiment.

Can you please analyze and share the breakdown of each layer type's contribution to the total inference time? You can find a reference here: https://github.com/espressif/tflite-micro-esp-examples/blob/master/examples/person_detection/main/main_functions.cc#L175

Please note that ESP-NN optimizations are not as effective on the ESP32 as on the ESP32-S3, which has AI instructions.

SaketNer commented:

Hi @william-hazem, have you tried pruning the model and reducing its size by quantizing to int8? In my case it helped improve the inference speed.

@github-actions github-actions bot changed the title Running custom classifier - Slow Inference Running custom classifier - Slow Inference (TFMIC-13) Dec 31, 2023
william-hazem (Author) commented:

I achieved better performance by adding some pooling layers to reduce the model size and by changing the convolution kernel to reduce the computational cost. The model was already quantized to int8, but I did notice a slight latency reduction when using it.

Thanks for the support @vikramdattu @SaketNer
