-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add mutex for queue processing. #24
Add mutex for queue processing. #24
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested with ESP32S, resolves the duplicate line issues.
Did some load testing and I have observed exceptions in the process. Used a WEB based ESP trace dump processor however it does not output anything. Not sure why. Will need to investigate further. Happens when the logger is under pressure but it could be a number of things such as the WEB server, HA API, uart or mqtt. |
Can you share the test code so I can reproduce it locally? |
Observed with this code https://github.com/GelidusResearch/grps when the number component values are rapidly fired in the WEB UI it sends out significant logging and HA API publish events. The exception occurs at random points under this pressure. I don't think this change is the root cause but I have not ruled it out. Removing the log queuing suppressed the exception frequency but that's not and indication either since there are other issues present like API pressure. Really need to see the where it was in the stack dump but thats not working atm. |
Update: Tested this more, when the serial logging outputs to the local port and or the Async Web Server we do not observe any exceptions with a queue length of up to 24. At a queue length of 32 it will exception. I have not decoded the trace yet. The queue max is hit and a queue limit error is logged at 24. At 32 no limit error is logged. |
I can see the log output esphome:
name: basic
esp32:
board: esp32dev
framework:
type: arduino
logger:
level: VERBOSE
log_queue_length: 32
interval:
- interval: 1s
then:
lambda: |-
int num_executions = 0;
for (int i = 0; i < 50; i++) {
ESP_LOGI("Test", "I am at execution number %d", num_executions);
id(s01).publish_state(num_executions);
id(s02).publish_state(num_executions);
id(s03).publish_state(num_executions);
num_executions++;
}
number:
- platform: template
id: s01
name: s01
optimistic: true
min_value: 0
max_value: 100
step: 1
- platform: template
id: s02
name: s02
optimistic: true
min_value: 0
max_value: 100
step: 1
- platform: template
id: s03
name: s03
optimistic: true
min_value: 0
max_value: 100
step: 1
web_server:
port: 80
wifi:
ssid: !secret wifi_ssid
password: !secret wifi_password When the ESP32 is crashing very different exceptions very time it is possible that it is an out of memory situation. |
Do you have an HA instance available, It could be the api socket. I will try a test without and api socket. |
Ok finally captured the stack dump. PC died just after _runQueue, looks like the Q pointer is out of bounds to me.
|
My test is invalid, mutex was not in the code, must have been overwritten. Re-applying and retesting. |
My bad, with the mutex back in place it's functional. Sorry. |
I have the feeling the hole code is not perfect. |
Yep, same page here. One small step at a time. Thanks for working on it. |
Hi @jesserockz, do you have time to review this PR? |
FYI I came across another implementation here (https://github.com/me-no-dev/ESPAsyncWebServer/pull/887/files), explained here (Aircoookie/WLED#3382 (comment)) |
As part of esphome/esphome#5373 I identified that the method
AsyncEventSourceClient::_runQueue
is not multi threading safe.Multiple threads sending messages will result in the same message being transmitted multiple times to the client.
This PR adds a mutex to prevent this.