Security: Stop RocksDB threads calling unexpected code when zebrad exits
#3133
Comments
Hey team! Please add your planning poker estimate with ZenHub @conradoplg @dconnolly @jvff @oxarbitrage @teor2345 @upbqdn |
It seems a lot of things are going on here, and I spent a good amount of time trying several alternatives. Warnings as The downside of this are mainly:
To avoid the last 2 points, the non-runtime shutdown (https://github.com/ZcashFoundation/zebra/blob/v1.0.0-beta.3/zebrad/src/components/tokio.rs#L62-L63) was introduced. However, as this introduces the C++ errors, the suggested design was to keep the non-runtime shutdown but close the database properly from the application, hoping to get rid of them. There were a few options proposed for this; I personally created a service request into the The result was a clean database cleanup without using the runtime destructor, but unfortunately the C++ issues at shutdown do not disappear. So there is something else that causes them. It could still be the RocksDB database, but I am not so sure. Another approach to try could be to launch all the tasks in a thread pool (https://doc.rust-lang.org/book/ch20-03-graceful-shutdown-and-cleanup.html) and try to `abort()` them at shutdown, while we don't use the runtime shutdown. The result of doing that is uncertain, so I am not pushing forward with it. In my opinion, here is what we can do:
|
There is another case happening here that seems to be independent of whether we are shutting down Zebra using the runtime or not. This is a hang: when Ctrl-C is received, Zebra does nothing further but does not exit either:
|
I'm not sure if this change is needed. Have you tried: If you call it here, the runtime should stop waiting for all tasks to finish:
These issues aren't scheduled until the lightwalletd work:
They are tracked in #1351. |
I had been trying today with |
That's really strange, because this is the recommended method, and other projects use it successfully. Did you also delete the calls to |
I didn't delete the exits; I will try. |
On top of the above PR, I removed the full shutdown implementation on our side that is doing the exits: https://github.com/ZcashFoundation/zebra/blob/v1.0.0-beta.3/zebrad/src/components/tokio.rs#L46-L71 It didn't work; I am still seeing:
occasionally. |
Description
When I terminate `zebrad start` using Ctrl-C (SIGINT) on Linux, I get pthread errors.
I can reliably generate these errors during startup and the initial peer connections. But they seem much less frequent after initialization.
This is a serious security risk, because calling pure virtual methods is unexpected code execution. So compilers could optimise it into something dangerous, or something that could be exploited. (That doesn't happen in this case - instead, it prints a warning. But there are no guarantees.)
Initial Analysis
We recently made a few changes to Zebra's startup and shutdown behaviour (#3057, #3071, #3078, #3091).
Some of these errors seem to come from C++ code ("pure virtual method called"), so it's probably a combination of:
PR #3091 also stopped `zebrad` deleting ephemeral state directories. We should fix that issue as part of this bug fix.
Suggested Design
Precautions:
Shut down in this order:
`CancelAllBackgroundWork()` when Zebra wants to exit: https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ
We can do this via a `shutdown` method in the state service and finalized state, which is called:
`debug_stop_at_height`, or
`Shutdown` request to the state, in the `start` and `copy-state` (both state services) commands.
After we've waited for `shutdown` to finish, it is safe to exit Zebra.
Other state requests could be processed while we are exiting Zebra:
`shutdown` is called
Rejected Alternatives
Rust's `sys::at_exit` handler has been removed: rust-lang/rust#84115
There's no guarantee that `libc::atexit` runs before RocksDB's static destructors, just like there's no guarantee of the exit order between Rust and C: https://stackoverflow.com/questions/35980148/why-does-an-atexit-handler-panic-when-it-accesses-stdout
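In contrast to the exit-handler alternatives above, the explicit ordering from the suggested design (cancel background work, wait for it to finish, and only then exit) could look roughly like the sketch below. This is only an illustration under assumptions: `MockDb`, `open`, and `shutdown` are hypothetical names, the background thread stands in for RocksDB's background work, and nothing here calls the real RocksDB API or Zebra's state service.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a database that owns a background thread.
/// It only models "cancel, then wait"; it is not the RocksDB API.
struct MockDb {
    cancelled: Arc<AtomicBool>,
    background: Option<JoinHandle<()>>,
}

impl MockDb {
    /// Opens the mock database and starts its background worker.
    fn open() -> Self {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);
        let background = thread::spawn(move || {
            // Simulated background work: loop until cancellation is requested.
            while !flag.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
        });
        MockDb {
            cancelled,
            background: Some(background),
        }
    }

    /// Requests cancellation, then blocks until the worker has exited.
    /// Returns `true` if this call joined the worker cleanly, and `false`
    /// if the database was already shut down (or the worker panicked).
    fn shutdown(&mut self) -> bool {
        self.cancelled.store(true, Ordering::Release);
        match self.background.take() {
            Some(handle) => handle.join().is_ok(),
            None => false,
        }
    }
}

impl Drop for MockDb {
    /// Fallback for paths that drop the database without calling `shutdown`.
    fn drop(&mut self) {
        self.shutdown();
    }
}

fn main() {
    let mut db = MockDb::open();
    // Pretend the application runs for a while, then decides to exit.
    thread::sleep(Duration::from_millis(10));
    let stopped = db.shutdown();
    println!("background work stopped cleanly: {stopped}");
    // No database thread is running past this point, so exiting the
    // process here cannot race with the worker.
}
```

The point of the sketch is that the wait happens inside the application, before any process exit, so it does not depend on the relative order of exit handlers and static destructors.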
Steps to Reproduce
I tried this:
`zebrad start`
Ctrl-C (SIGINT)
Expected Behaviour
I expected to see this happen: `zebrad` terminates quickly, without errors.
Actual Behaviour
Instead, this happened: `zebrad` terminates quickly, with errors like:
Zebra Logs
Environment
Zebra Version
Operating System