Problem: wrong Block.Header.AppHash crashes #284
Comments
@hpmv Thanks for reporting the issue. For the wrong app hash error, it would help if you could provide more details, such as which block heights and which network (I assume the mainnet beta?) this happens on. Given that it was on v0.6.1, there may also have been some unnoticed consensus-state-breaking changes between 0.6.1 and 0.6.5. The latest Tendermint has a rollback feature, but it hasn't been used in Cosmos SDK yet: cosmos/cosmos-sdk#10281. @yihuang @JayT106 may advise if there's a manual workaround in the meantime.
Thanks @tomtau! The network is mainnet beta, and the height was as shown in the error message: 780414. I just got it again at another height, 789757. What's the version (commit hash) that's supposed to be running in mainnet beta?
@hpmv you might try to update the DB state by modifying the WAL: remove the latest messages until the previous However, it is not guaranteed to work, because the root cause of the appHash crashes is unknown. We need more investigation to understand the issue.
Thanks Jay! Is this a known issue in the community (I see a previous bug filed about this too)?
OK, it seems this may be a duplicate issue: #256
We observed this on one of our RPC nodes after upgrading to 0.6.6. After inspecting and comparing the IAVL storage with the iaview tool, we found that this transaction's sender has a different balance on the problematic node than on a normal node. The numbers match the hypothesis that the tx was reverted on the problematic node (so the sender's balance was deducted by "gas limit * gas price") but executed successfully on the normal nodes.
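To make the balance check above concrete, here is a minimal sketch of the arithmetic. It is not the tooling used in this thread. The function names and all numbers are hypothetical, and it assumes a reverted tx is charged exactly gas limit * gas price, as the comment describes.

```python
def balance_if_reverted(balance_before: int, gas_limit: int, gas_price: int) -> int:
    """Sender balance expected when the tx reverts and the full gas limit is charged."""
    return balance_before - gas_limit * gas_price


def matches_revert_hypothesis(
    balance_before: int, balance_on_bad_node: int, gas_limit: int, gas_price: int
) -> bool:
    """True if the balance seen on the problematic node equals the reverted-tx balance."""
    return balance_on_bad_node == balance_if_reverted(balance_before, gas_limit, gas_price)


if __name__ == "__main__":
    # Hypothetical values in the chain's base denomination, not taken from the thread.
    before = 5_000_000_000_000_000_000
    gas_limit = 100_000
    gas_price = 5_000_000_000_000
    on_bad_node = before - gas_limit * gas_price
    print(matches_revert_hypothesis(before, on_bad_node, gas_limit, gas_price))
```

If the check holds on the problematic node but not on a healthy one, that supports the idea that the same tx took different execution paths on the two nodes.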
Is PR #377 the root cause of the AppHash mismatch in 0.6.6?
No, that one was released in 0.6.8, but the issue happening today affects 0.6.6 and above.
@hpmv, what's your setup for
After investigating the recent crash cases, we suspect the EVM module may be producing non-deterministic results. But we need more crashed databases to identify which part of the EVM module causes the issue.
This rollback command may help in the future: cosmos/cosmos-sdk#11361
We believe the root cause has been found. The workaround for now is to increase the file open limit using
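The exact command in the comment above is cut off, so the following is only a sketch of how one might inspect the open-file limit from Python on a Unix-like system, using the standard `resource` module. The helper names are hypothetical. `raise_soft_limit` affects only the current Python process and any children it starts, not an already running node.

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target: int) -> int:
    """Raise the soft limit toward `target`, capped by the hard limit; never lower it."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    new_soft = max(soft, target)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard}")
```

For a node started from a shell or a service manager, the limit has to be raised in that environment before the node process starts; this snippet is mainly useful for confirming what limit a process actually sees.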
Describe the bug
My node once in a while crashes with errors like this:
Restarting the node gives a similar error.
To Reproduce
I cannot reproduce this reliably. It just happens while the node is running, perhaps once every couple of days. I'm using version v0.6.1.
Expected behavior
The app really should recover from such errors by automatically reverting to the previous height. A manual tool like state_recover from BSC would also be great. Right now there's no solution other than recovering from a disk backup.
Could a dev tell me how to revert to the previous height manually by setting leveldb keys? I know I need to set a couple of keys on the Tendermint side, but the app side is too confusing for me to dig into. Thanks!