Relayer error handling specification #712
Comments
May be worth trying to get these typed upstream in ibc
I tried to do a full trace of an example error message to see how it is propagated. There is a complicated path from the Hermes relayer to the IBC module, and there are many potential errors that can occur in each step. On the Go side, the error messages are formatted using helpers such as Here is an example error message:
And here is the manual stack trace:
I'm not sure there is a good way to document all the errors in the Go code, other than searching through the code with keywords such as:
We get the response from the Tx here: https://github.com/informalsystems/ibc-rs/blob/2592440b0c599304a1032746ce975b2cf7baa673/relayer/src/chain/cosmos.rs#L1479 The response has a We currently take the For IBC the errors (codes) are defined in the If I try to send a And we get this response (println from our
I think we need to look at all ibc
Yes, that is my impression as well. However, the error codes are just integers specific to the module, and I am not sure they are guaranteed not to overlap when making a call. My initial thought was to do string pattern matching to determine the exact error. Also, the Go code does not just use So do you mean the main choke point we need to handle is
they don't within the
yes, but on the relayer side we are interested in that
Sorry, I should have added more details to
Aha! So that's the devil in the detail! Then I'm confident I can interpret the error code directly. Btw, I'm still quite confused that when an error happens,

    if response.check_tx.code.is_err() {
        return Ok(vec![IbcEvent::ChainError(format!(
            "check_tx reports error: log={:?}",
            response.check_tx.log
        ))]);
    }

Is that intentional? If not, I will just change it to return them as newly classified errors in relayer::errors::Error.
> Is that intentional? If not I will just change it to return them as newly classified errors in relayer::errors::Error.

That's our current way to distinguish between chain IBC errors and all the rest. There are very few for which we should retry here, but we need to go over the full list of IBC SdkErrors first. And yes, we can work with typed errors here also.
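For illustration, here is a minimal sketch of what returning typed errors from the `check_tx` result could look like, instead of wrapping the log in an `Ok(IbcEvent::ChainError(..))`. Everything in it is an assumption made for the example: `CheckTx` is a simplified stand-in for the real response type, `TxError` and `classify_check_tx` are hypothetical names, and the two `("sdk", code)` pairs are, as far as I know, the Cosmos SDK root codespace values for "out of gas" and "incorrect account sequence"; they should be checked against the SDK version the chain runs.

```rust
/// Hypothetical, simplified stand-in for the `check_tx` part of a
/// broadcast response (not the real tendermint-rs type).
#[derive(Debug, Clone)]
pub struct CheckTx {
    /// ABCI result code; 0 means success.
    pub code: u32,
    /// Namespace of the module that produced the code.
    pub codespace: String,
    /// Human-readable log; kept only for diagnostics.
    pub log: String,
}

/// Hypothetical typed errors the relayer could act on.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TxError {
    OutOfGas,
    WrongSequence,
    /// Anything not classified yet keeps its raw details.
    Unclassified { codespace: String, code: u32, log: String },
}

/// Interpret the (codespace, code) pair directly rather than
/// pattern matching on the log string.
pub fn classify_check_tx(resp: &CheckTx) -> Result<(), TxError> {
    if resp.code == 0 {
        return Ok(());
    }
    Err(match (resp.codespace.as_str(), resp.code) {
        // Assumed SDK root codespace values; verify before relying on them.
        ("sdk", 11) => TxError::OutOfGas,
        ("sdk", 32) => TxError::WrongSequence,
        (codespace, code) => TxError::Unclassified {
            codespace: codespace.to_string(),
            code,
            log: resp.log.clone(),
        },
    })
}
```

Matching on the pair rather than on the code alone sidesteps the overlap concern raised above, since a code only needs to be unique within its codespace.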
Seems like we have some dead code capturing various errors that come from IBC-go. Ref: https://github.com/informalsystems/ibc-rs/pull/2266#issuecomment-1145951834
Closing as most of the work has been done in #988 and other PRs over the years. Let's reopen if needed.
Crate
docs
Summary
Currently the relayer performs retries for any error that occurs while building messages and transactions.
Problem Definition
There are RPC, light client, on-chain, and internal errors, among others. Leaving aside errors in the relayer code, the on-chain errors are the most complex: they are subtle and they are not typed (we just get an error message with the reason embedded in a string). A retry is appropriate in some cases; in others we should not retry (e.g. a transaction has already been submitted by another relayer), and in yet others we should rebuild the messages and transactions.
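As a rough sketch of the kind of mapping the analysis should produce, the snippet below pairs each error category with an action. The type names, the variants, and in particular the category-to-action assignments are all hypothetical; only the "already submitted by another relayer, so do not retry" case comes from the description above.

```rust
/// Hypothetical broad error categories, following the list above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RelayError {
    Rpc(String),
    LightClient(String),
    /// The transaction was already submitted by another relayer.
    OnChainAlreadyRelayed,
    /// Any other on-chain error; the reason is still an untyped log string.
    OnChainOther { log: String },
    Internal(String),
}

/// What the relayer does next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Send the same thing again.
    Retry,
    /// Rebuild the messages and the transaction, then send.
    Rebuild,
    /// Drop the work item; nothing to do.
    Skip,
    /// Give up and surface the error.
    Abort,
}

/// Illustrative mapping only; the real table is what this issue asks for.
pub fn action_for(err: &RelayError) -> Action {
    match err {
        RelayError::Rpc(_) => Action::Retry,
        RelayError::LightClient(_) => Action::Rebuild,
        RelayError::OnChainAlreadyRelayed => Action::Skip,
        RelayError::OnChainOther { .. } => Action::Rebuild,
        RelayError::Internal(_) => Action::Abort,
    }
}
```

Keeping the decision in a single exhaustive `match` means that adding a new error variant fails to compile until an action has been chosen for it.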
Proposal
Analyze the code, document all errors and the actions to be taken in each case.
Consider also the exponential backoff as per @romac's informalsystems/ibc-rs#709 (comment), where applicable.
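A minimal sketch of exponential backoff, assuming a doubling delay with an upper cap. The names `backoff_delay` and `retry_with_backoff` and all parameters are hypothetical and not taken from the relayer code or from the linked comment; the `should_retry` predicate is where a per-error decision would plug in.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// saturating on overflow and never exceeding `cap`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Run `op` at least once and at most `max_attempts` times, sleeping
/// between attempts, and only while `should_retry` accepts the error.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut should_retry: impl FnMut(&E) -> bool,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt: u32 = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempt + 1 < max_attempts && should_retry(&e) => {
                std::thread::sleep(backoff_delay(attempt, base, cap));
                attempt += 1;
            }
            // Out of attempts, or the error is not worth retrying.
            Err(e) => return Err(e),
        }
    }
}
```

The cap keeps a long outage from producing unbounded waits, and errors rejected by `should_retry` are returned immediately without sleeping.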