[4.0] hardening resource monitor manager plugin shutdown handling #1774
Those tests were intended to verify the duration of space_monitor_loop. In essence they tested Boost's expires_from_now, which was not necessary. The tests themselves were hacky and took an unnecessary 50 seconds.
* removes unnecessary check of info logging "Creating and starting monitor thread"
* reduces time to wait for nodeos startup from 120 seconds to 10 seconds
      boost::system::error_code ec;
      timer.cancel(ec);
   }
   thread_pool.stop();
Once stop() returns it's impossible for timer to be accessed from the thread_pool, so I'm not immediately seeing a need for this mutex: just stop the thread_pool and let timer be dtor'ed on the main thread as it is already.
If you do that, I also suspect you don't even need the check at line 178 of leap/plugins/resource_monitor_plugin/include/eosio/resource_monitor_plugin/file_space_handler.hpp (in dc87f58):
   if ( ec != boost::asio::error::operation_aborted ) { // not cancelled
because once the thread_pool is stopped no callbacks will be run, ergo the cancellation callback of the timer in its dtor will never be run.
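The reasoning above can be sketched with the standard library alone. This is a minimal illustration, not the plugin's code: `monitor`, `fake_timer`, `run_and_stop` and `ticks_after_stop` are hypothetical names, a plain `std::thread` stands in for the thread_pool, and it assumes that stop() returns only after the worker thread has been joined. Under that assumption the state the worker used can be read and destroyed on the calling thread without a mutex.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the plugin's timer: plain, non-atomic state.
struct fake_timer {
    int ticks = 0;
};

// Hypothetical monitor class, not the real resource monitor plugin.
class monitor {
public:
    void start() {
        worker_ = std::thread([this] {
            // While running, only the worker thread touches timer_.
            while (!stop_requested_.load(std::memory_order_acquire)) {
                ++timer_.ticks;
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Assumed analogue of stopping the thread pool: returns only after
    // the worker has exited, so no task can run afterwards.
    void stop() {
        stop_requested_.store(true, std::memory_order_release);
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    // Safe without a mutex only after stop(): join() orders all of the
    // worker's writes before this read.
    int ticks_after_stop() const { return timer_.ticks; }

    ~monitor() { stop(); }  // timer_ is destroyed on the owning thread

private:
    fake_timer timer_;
    std::atomic<bool> stop_requested_{false};
    std::thread worker_;
};

// Returns the tick count if it stays unchanged after stop(), otherwise -1.
int run_and_stop() {
    monitor m;
    m.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    m.stop();
    const int first = m.ticks_after_stop();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    const int second = m.ticks_after_stop();
    return first == second ? first : -1;
}
```

Calling `run_and_stop()` should never return -1, because nothing can modify the counter once the worker has been joined. If stop() did not join the worker, the same unsynchronized read would be a data race, and a mutex or an explicit cancel would be needed again.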
Thanks for your insight. I updated the code.
* no need to cancel the timer explicitly.
* no need to use a mutex for the timer: by the time the timer is used on the main thread, the resource monitor thread has already stopped.
#1485 reports occasional resource manager plugin test failures. The failure is shown as
/__w/leap/leap/plugins/resource_monitor_plugin/test/test_resmon_plugin.cpp(147): fatal error: in "resmon_plugin_tests/startupNormal": unexpected exception thrown by plugin_startup({"/tmp"})
without any additional information.
Further investigation reveals that the scheduled monitor timer task is not cancelled during plugin shutdown, and a monitor task might still be running before the thread exits. This PR hardens the plugin shutdown process.
Resolves #1485