Running custom classifier - Slow Inference (TFMIC-13) #25
Comments
Hi @william-hazem The code you are using is already optimized and takes advantage of esp-nn under the hood. What is the size of the model you're using, and what inference time are you getting?
Hi @vikramdattu, thanks for answering. My model is 14832 bytes and the tensor arena consumes around 155 kB. Currently I'm getting an inference time between 4 and 5 seconds. My model uses float data and only a few layers. My ESP is running with the following:
@william-hazem can you please change the SPI speed to 80 MHz and check? Also, can you please use internal memory for the arena, if that's a possibility? Optionally, via menuconfig, you may want to set the Task Watchdog timeout to 5 seconds.
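For readers following along: these suggestions map to sdkconfig options plus a choice of where the arena lives. A minimal sketch, assuming a recent ESP-IDF (option names vary slightly across IDF versions, and `kTensorArenaSize` is just the arena size discussed in this thread):

```
# Serial flasher config → Flash SPI speed → 80 MHz
CONFIG_ESPTOOLPY_FLASHFREQ_80M=y
# If the arena sits in external PSRAM, its bus speed matters too:
CONFIG_SPIRAM_SPEED_80M=y
# Component config → ESP System Settings → Task Watchdog timeout
CONFIG_ESP_TASK_WDT_TIMEOUT_S=5
```

And keeping the tensor arena in internal SRAM instead of PSRAM:

```cpp
#include "esp_heap_caps.h"

// A plain static buffer is placed in internal .bss by default.
constexpr size_t kTensorArenaSize = 160 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];

// Or allocate explicitly from internal RAM at runtime:
// uint8_t* tensor_arena = (uint8_t*)heap_caps_malloc(
//     kTensorArenaSize, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
```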
Changing the SPI speed to 80 MHz didn't improve speed. But I set the compiler optimization level to -O2 ("Performance") and then it ran better: the inference time dropped, and each inference now takes about 2400 ms instead of ~4400 ms. I'm currently using these operators, and Conv2D is very expensive in both inference time and memory:

```cpp
static tflite::MicroMutableOpResolver<7> resolver;
resolver.AddFullyConnected();
resolver.AddQuantize();
resolver.AddDequantize();
resolver.AddAveragePool2D();
resolver.AddConv2D();
resolver.AddLogistic();
resolver.AddMean();
```

I think I need to make the convolutional layers lighter, and keep compiling my IDF project with the -O2 flag to gain performance.
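For reference, the -O2 ("Performance") setting mentioned above corresponds to a single sdkconfig option; a minimal fragment, assuming a recent ESP-IDF:

```
# Component config → Compiler options → Optimization Level → Optimize for performance (-O2)
CONFIG_COMPILER_OPTIMIZATION_PERF=y
```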
Hi, @william-hazem thanks for sharing the result of the optimization flag experiment. Can you please analyze and share the breakdown of how much each layer type contributes to the total inference time? You can find a reference here: https://github.com/espressif/tflite-micro-esp-examples/blob/master/examples/person_detection/main/main_functions.cc#L175 That will show which layers dominate.
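For anyone landing here, a minimal sketch of such per-layer profiling using tflite::MicroProfiler (the constructor arguments assume a recent tflite-micro; the linked example does essentially this):

```cpp
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_profiler.h"

// model, resolver and tensor_arena set up as elsewhere in this thread.
tflite::MicroProfiler profiler;
tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                     kTensorArenaSize,
                                     /*resource_variables=*/nullptr,
                                     &profiler);
interpreter.AllocateTensors();
interpreter.Invoke();
// Logs one line per op invocation with its tick count, so expensive
// layers such as Conv2D stand out immediately.
profiler.Log();
```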
Hi, @william-hazem have you tried pruning the model and reducing the model size by quantization to int8? In my case it helped improve the inference speed.
I achieved better performance by adding some pooling layers to reduce the model size and changing the convolution kernel to reduce computational cost. The model was already quantized to int8, but I still noticed a slight latency reduction when using it. Thanks for the support @vikramdattu @SaketNer
Hi Everyone!
I built my own classifier and embedded it into an ESP32, but every time it runs an inference the watchdog timer is triggered.
My loop function:
Is there any way to improve speed without triggering the watchdog?
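(For later readers: the fixes that worked are in the comments above. As a stopgap, one can also feed the task watchdog around the long Invoke() call and yield between inferences; a minimal sketch, assuming the loop runs in a task subscribed to the Task WDT:)

```cpp
#include "esp_task_wdt.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

void loop() {
  // ... fill the input tensor ...
  esp_task_wdt_reset();           // feed the watchdog before the long call
  interpreter->Invoke();          // the multi-second blocking step
  esp_task_wdt_reset();
  // ... read the output tensor ...
  vTaskDelay(pdMS_TO_TICKS(10));  // yield so the idle task can run
}
```

Note this only helps between inferences; if the idle-task watchdog fires during Invoke() itself, raising the timeout via menuconfig (as suggested above) is the practical fix.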