-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fix #884, protect list concurrent access with mutex #887
base: master
Are you sure you want to change the base?
Conversation
|
||
public: | ||
AsyncPlainLock() { | ||
_lock = xSemaphoreCreateBinary(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xSemaphoreCreateBinary() can fail in case of less memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_lock = xSemaphoreCreateBinary(); | |
_lock = xSemaphoreCreateBinary(); | |
// If this fails, the system is likely so much out of memory that we should abort | |
assert(_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original implementation of the AsyncWebLock (below the new class) does the same without an assert etc.
What would you suggest, put an abort() or assert() there? If yes, this should go in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert() is fine and give the developer enough information I think.
} | ||
|
||
bool lock() const { | ||
xSemaphoreTake(_lock, portMAX_DELAY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xSemaphoreTake() can fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edited) No, actually in this case, xSemaphoreTake, which is an alias to xQueueGenericReceive, would only ever have a pdFALSE return value if an effective timeout was supplied as an argument. But the portMAX_DELAY in this case means the timeout is disabled. xSemaphoreTake does either lock indefinitely or return pdTRUE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xQueueGenericReceive, I meant it /can/ only return "pdFALSE" when invoked with a non-infinite timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. I expected a (void)xSemaphoreTake(...)
, but this may depend on the coding style.
xSemaphoreGive(_lock); | ||
} | ||
}; | ||
|
||
// This is the ESP32 version of the Sync Lock, using the FreeRTOS Semaphore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its the ESP8266 version, not ESP32 ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the define is OK and for ESP8266, but still I noticed a bug there, as I accidentally left the "mutable void *_lockedBy;" member in for the supposedly "plain" lock class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, your right here. Looks like I read the comment and scrolled to fast down.
src/AsyncWebSynchronization.h
Outdated
{ | ||
private: | ||
SemaphoreHandle_t _lock; | ||
mutable void *_lockedBy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was supposed to be left out..
mutable void *_lockedBy; |
src/AsyncEventSource.cpp
Outdated
@@ -276,56 +294,70 @@ void AsyncEventSource::_addClient(AsyncEventSourceClient * client){ | |||
client->write((const char *)temp, 2053); | |||
free(temp); | |||
}*/ | |||
|
|||
_client_queue_lock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By calling the connect callback with the client queue lock held, this code introduces a self-deadlock if the connect callback, in turn, tries to broadcast a message to the event source using AsyncEventSource::send (or tries to use any other public API that tries to take that very same lock).
I had this exact scenario in my project, and I had to fix it as shown in commit yubox-node-org@646eab7 of yuboxfixes branch of the ESPAsyncWebServer fork at yubox-node-org.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just now realized that the extern void *pxCurrentTCB member of class AsyncWebLock does /not/ point to the TCP TCB struct (TCB as in "transfer control block") but to the FreeRTOS stack pointer, and that this is how the AsyncWebLockGuard works..
I only did minimal testing, but I assume this is safe to pull into this PR for completeness.
I hope that the original author does find some time in the near future to go through all these forks and PRs...
Thank you very much for the notice!
Commit 344062a intended to protect the client list for AsyncEventSource from concurrent access by wrapping accesses with locking. However, this introduces a self-deadlock scenario if application code in the onConnect callback (now called with the client list lock held) tries to broadcast a message to existing connections in the same AsyncEventSource (for example, to alert of a new incoming connection). The broadcast call tries to take the lock but blocks because the lock is already taken, by the same task, before calling the callback. Fixed by making use of the AsyncWebLockGuard class which is tailor-made to address this scenario. (cherry picked from commit 646eab7)
Same as PR 886, fix #884 but remastered into single commit.