Bug in channel implementation using mutex and condition variables #25
Comments
Thanks for fuzzing domainslib; it is useful. I'm a bit confused about how your proposed fix would work. Isn't there a unique mutex+condition_variable for each waiting receiver in Line 150 in 3d5f79c?
I can see that there is a problem if domainslib is used with systhreads.
With systhreads and continuations/fibers it is possible for multiple execution contexts (aka 'threads') to share the same I make fixing the problem with systhreads and continuations/fibers being to replace
I agree on the fix of replacing
There seems to be a bug in the channel implementation using mutex+condition variable, as implemented in PR #17, which causes the task_exn test to deadlock or loop infinitely. The bug manifests itself only in some runs; otherwise the test works fine. I was able to find this subtle bug using ParaFuzz.
Steps to reproduce
Root cause
I debugged the test program and suspect that this has to do with the channel implementation using a single mutex+condvar for all channels. The exact root cause is that when two receivers are waiting for a message here and a sender sends a message here, the message gets lost. The receivers wait on the same condition variable for the sender to write to a ref that is unique to each waiting receiver. The problem occurs when the sender writes to ref r1 (waited upon by receiver 1) and Condition_variable.signal wakes up receiver 2, which finds ref r2 not updated, so it waits again. So even after the message is sent, both receivers are still waiting and the message is lost. The bug is mainly caused by the underlying assumption that waiting threads are woken up in the order in which they started waiting on the condition variable, which is incorrect according to the pthread_cond_signal documentation.
Fix
The fix is to use a unique mutex+condition_variable for each waiting receiver.
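
As a rough illustration of the idea, here is a minimal sketch of a per-receiver wait slot. It is not domainslib's actual code: the `waiter` type and the `make_waiter`, `wait`, and `deliver` functions are hypothetical names made up for this example, and it assumes OCaml 5, where `Mutex`, `Condition`, and `Domain` are available in the standard library. Because each receiver sleeps on its own condition variable, a signal meant for receiver 1 cannot wake receiver 2 instead.

```ocaml
(* Hypothetical sketch, not domainslib's implementation. Assumes OCaml 5. *)

(* One of these is created per waiting receiver. *)
type 'a waiter = {
  slot : 'a option ref;  (* the ref the sender writes the message into *)
  mutex : Mutex.t;       (* private to this receiver *)
  cond : Condition.t;    (* private to this receiver *)
}

let make_waiter () =
  { slot = ref None; mutex = Mutex.create (); cond = Condition.create () }

(* Receiver side: block until our own slot has been filled.
   The loop re-checks the slot, so spurious wakeups are harmless, and a
   message delivered before we start waiting is picked up immediately. *)
let wait w =
  Mutex.lock w.mutex;
  while Option.is_none !(w.slot) do
    Condition.wait w.cond w.mutex
  done;
  let v = Option.get !(w.slot) in
  Mutex.unlock w.mutex;
  v

(* Sender side: fill the chosen receiver's slot and signal only that
   receiver's condition variable. *)
let deliver w v =
  Mutex.lock w.mutex;
  w.slot := Some v;
  Condition.signal w.cond;
  Mutex.unlock w.mutex

let () =
  let w1 = make_waiter () and w2 = make_waiter () in
  let d1 = Domain.spawn (fun () -> wait w1) in
  let d2 = Domain.spawn (fun () -> wait w2) in
  deliver w1 "for receiver 1";
  deliver w2 "for receiver 2";
  assert (Domain.join d1 = "for receiver 1");
  assert (Domain.join d2 = "for receiver 2")
```

With a single shared condition variable, the `deliver w1` call above could wake the domain waiting on `w2`, which would see its slot still empty and go back to sleep while the first receiver never wakes; with one condition variable per waiter there is only ever one possible thread to wake.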