[3.1] Fix race condition on trace_api_plugin shutdown #592
Failure https://github.com/eosnetworkfoundation/mandel/runs/7143297502?check_suite_focus=true pointed to an issue in the shutdown of the `trace_api_plugin`. `slice_directory::stop_maintenance_thread()` sets the atomic bool `_maintenance_shutdown = true` and calls `notify_one()` on the `_maintenance_condition` condition variable. This is a race condition because it does not acquire the condition variable's mutex, which makes it possible for `_maintenance_shutdown` to be set to `true` and `notify_one()` to be called after the `while` check in `slice_directory::start_maintenance_thread` but before the `wait()`. The notification is then lost and the maintenance thread waits forever. This in turn blocks the `join()` call in `slice_directory::stop_maintenance_thread()` on the main thread, blocking the shutdown of all other plugins.

Changed the logic to correctly use a mutex and condition variable around `_best_known_lib` and `_maintenance_shutdown`. Also made these two variables non-atomic since they are now only accessed while holding the mutex. Also release the mutex before running `run_maintenance_tasks()`, which appears to have been the original intention behind the use of the local variables.

Accidentally targeted `main` on #591, so this is a cherry-pick back to `3.1`.