messageQueue not showing Error when success is no #478

Open
albertov19 opened this issue Jul 21, 2023 · 14 comments
Labels
I10-unconfirmed Issue might be valid, but it's not yet known. T1-FRAME This PR/Issue is related to core FRAME, the framework. T6-XCM This PR/Issue is related to XCM.

Comments

@albertov19

There are some XCM messages that are failing to execute on the relay chain. We think we know where the issue lies, but it is somewhat hard to debug now that we can't see the execution error of the XCM messages.

image

Any idea how we can see this now?

Thanks

@bkchr
Member

bkchr commented Jul 21, 2023

CC @KiChjang

@ggwpez
Member

ggwpez commented Jul 21, 2023

You mean on Polkadot or Kusama? What kind of messages are failing? Can you please post the links to the events and the messages themselves?
The MQ pallet was deployed to Polkadot with the .43 upgrade and has been deployed on Kusama for longer. But there have been no error reports on Kusama so far...

@ggwpez
Member

ggwpez commented Jul 21, 2023

Okay, I assume you are talking about this block: https://polkadot.subscan.io/block/16477742?tab=event.
So far there have been 9 failed MQ dispatches on Polkadot and 42 on Kusama, whereas there have been 375 successful ones on Polkadot and 3446 (updated) on Kusama. It looks like we need to specifically investigate the failing messages.

The failed events in JSON format: failed DOT.json.txt failed KSM.json.txt

@albertov19
Author

@ggwpez I think, more importantly, being able to see the errors in Polkadot.js Apps would help with debugging.

We know why it failed.

Thanks

@ggwpez
Member

ggwpez commented Jul 21, 2023

The MQ pallet has no introspection into the error type of the underlying implementation of ProcessMessage since it just returns a success bool (as long as there was no gross error with the processing itself) from here.

You can see the error in the batch ItemFailed event as BadOrigin. The message processor is responsible for emitting any events that the outside world needs to know about in such a case.
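To illustrate the point about introspection, here is a minimal sketch of what the queue can and cannot see. The types are simplified stand-ins written for this example, not the actual pallet code: `QueueEvent` and `event_for` are hypothetical names, the real `Overweight` variant carries a `Weight` rather than a `u64`, and the real pallet handles overweight messages in more detail than shown here.

```rust
// Simplified stand-in for the error type of the ProcessMessage trait.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ProcessMessageError {
    BadFormat,
    Corrupt,
    Unsupported,
    Overweight(u64), // assumption: the real variant carries a `Weight`
}

// Hypothetical, reduced view of the events the queue can emit.
#[derive(Debug, PartialEq)]
enum QueueEvent {
    Processed { success: bool },
    ProcessingFailed { error: ProcessMessageError },
    OverweightEnqueued,
}

// The queue only sees `Ok(bool)` or one of its own error variants.
// An executor-level error such as BadOrigin collapses into `Ok(false)`.
fn event_for(result: Result<bool, ProcessMessageError>) -> QueueEvent {
    match result {
        Ok(success) => QueueEvent::Processed { success },
        Err(ProcessMessageError::Overweight(_)) => QueueEvent::OverweightEnqueued,
        Err(error) => QueueEvent::ProcessingFailed { error },
    }
}

fn main() {
    // A message whose execution failed inside the processor.
    println!("{:?}", event_for(Ok(false)));
    // A message the processor could not handle at all.
    println!("{:?}", event_for(Err(ProcessMessageError::Unsupported)));
}
```

In this model the reason behind `success: false` never reaches the queue, which is why the processor itself would have to report it.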

I would like to close this since it is the wrong abstraction to solve your issue.

@xlc
Contributor

xlc commented Jul 21, 2023

You can replay the block with chopsticks and enable runtime logging to see a bit more detail about the reason for the failure.

If that's not enough, you can add more logging and override the wasm to see the additional logs.

@girazoki
Contributor

@ggwpez this behavior changed with respect to the events that the UMP queue was throwing before. An example of this is block 16483792.

image

Here we had a Transact instruction that did not specify enough weight to execute the dispatchable. Before, the UMP queue would throw a MaxWeightInvalid outcome; now we only see success:false. See block 16250668 as an example.

image

We did not find a way to retrieve the execution error anywhere right now other than using chopsticks with additional logs like @xlc suggested.
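For readers unfamiliar with the Transact failure described above, the following is a rough sketch of the kind of check that yields MaxWeightInvalid: the weight of the dispatchable is compared against the `require_weight_at_most` given in the instruction. `Weight`, `XcmError` and `check_transact_weight` are simplified stand-ins defined here for illustration, and the numbers are made up.

```rust
// Simplified two-dimensional weight, modelled on ref_time / proof_size.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Weight {
    ref_time: u64,
    proof_size: u64,
}

impl Weight {
    // True only if every component is less than or equal to `other`.
    fn all_lte(self, other: Weight) -> bool {
        self.ref_time <= other.ref_time && self.proof_size <= other.proof_size
    }
}

#[derive(Debug, PartialEq)]
enum XcmError {
    MaxWeightInvalid,
}

// Hypothetical helper: reject the call if it needs more weight than
// the sender allowed in `require_weight_at_most`.
fn check_transact_weight(
    call_weight: Weight,
    require_weight_at_most: Weight,
) -> Result<(), XcmError> {
    if call_weight.all_lte(require_weight_at_most) {
        Ok(())
    } else {
        Err(XcmError::MaxWeightInvalid)
    }
}

fn main() {
    // Illustrative values only.
    let allowed = Weight { ref_time: 1_000_000_000, proof_size: 65_536 };
    let cheap_call = Weight { ref_time: 500_000_000, proof_size: 10_000 };
    let costly_call = Weight { ref_time: 2_000_000_000, proof_size: 10_000 };

    println!("{:?}", check_transact_weight(cheap_call, allowed));
    println!("{:?}", check_transact_weight(costly_call, allowed));
}
```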

@girazoki
Contributor

girazoki commented Jul 24, 2023

Debugging XCM failures is already not a straightforward thing, so all facilities are welcome.

@albertov19
Author

Without needing to use chopsticks, we need to be able to understand why the message fails to easily debug these things. Why remove the error? Now it becomes 100x harder to debug. To drive XCM usability we need to make it easier and ensure it is a "welcoming" experience...

@ggwpez
Member

ggwpez commented Jul 24, 2023

There is still an overweight error that the message processor can return: Overweight(Weight). The standard message processor should also return that.

I don't know exactly how Transact handles this, but at least the overweight error should still be there.

@ggwpez
Member

ggwpez commented Jul 24, 2023

> Without needing to use chopsticks, we need to be able to understand why the message fails to easily debug these things. Why remove the error? Now it becomes 100x harder to debug. To drive XCM usability we need to make it easier and ensure it is a "welcoming" experience...

We are not trying to remove errors or make debugging harder. Aggregating all possible errors from any possible implementation into one huge enum just does not sound like a good idea to me either. The MQ pallet is very abstract and generic. It should not need to know about any downstream error sources besides the ones defined in its trait: ProcessMessageError.

@girazoki
Contributor

Ah, I see. I understand the problem better now. What about implementing the ProcessMessage trait for pallet-xcm? That way pallet-xcm could deposit an event showing the XCM execution error before returning true or false.
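A minimal sketch of that idea, under heavy simplification: a wrapper records the detailed execution error as an event and only then reduces the outcome to a bool. None of these names come from pallet-xcm; `Execute`, `ReportingProcessor`, `Event::ExecutionFailed` and `AlwaysBadOrigin` are hypothetical and exist only to show the shape.

```rust
// Hypothetical detailed errors produced by the underlying executor.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ExecError {
    BadOrigin,
    MaxWeightInvalid,
}

// Hypothetical event carrying the detail that the queue itself cannot see.
#[derive(Debug, PartialEq)]
enum Event {
    ExecutionFailed { id: u32, error: ExecError },
}

// Stand-in for whatever actually executes the message.
trait Execute {
    fn execute(&mut self, message: &[u8]) -> Result<(), ExecError>;
}

// Wrapper that reports the error before collapsing the result to a bool.
struct ReportingProcessor<E> {
    inner: E,
    events: Vec<Event>,
}

impl<E: Execute> ReportingProcessor<E> {
    fn process_message(&mut self, id: u32, message: &[u8]) -> bool {
        match self.inner.execute(message) {
            Ok(()) => true,
            Err(error) => {
                // Deposit the detailed error first, then report failure.
                self.events.push(Event::ExecutionFailed { id, error });
                false
            }
        }
    }
}

// Test executor that always fails with BadOrigin.
struct AlwaysBadOrigin;

impl Execute for AlwaysBadOrigin {
    fn execute(&mut self, _message: &[u8]) -> Result<(), ExecError> {
        Err(ExecError::BadOrigin)
    }
}

fn main() {
    let mut processor = ReportingProcessor { inner: AlwaysBadOrigin, events: Vec::new() };
    let success = processor.process_message(1, &[0x03, 0x14]);
    println!("success = {success}, events = {:?}", processor.events);
}
```

With this shape the queue stays generic, while the detailed error is still visible to anyone reading events.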

@albertov19
Author

> > Without needing to use chopsticks, we need to be able to understand why the message fails to easily debug these things. Why remove the error? Now it becomes 100x harder to debug. To drive XCM usability we need to make it easier and ensure it is a "welcoming" experience...

> We are not trying to remove errors or make debugging harder. Aggregating all possible errors from any possible implementation into one huge enum just does not sound like a good idea to me either. The MQ pallet is very abstract and generic. It should not need to know about any downstream error sources besides the ones defined in its trait: ProcessMessageError.

I understand, but as users of XCM, we create tooling and debugging mechanisms expecting certain behaviors. If something is changed upstream, there should be a description of how to act on or work around the new behavior.

Do you know a way I can get the error for a given message? In another scenario I'm testing, I get Unsupported, which is not much help when trying to debug what is going on.

@albertov19
Author

@ggwpez another issue that is not straightforward is how to weigh a message, for example. I would like to weigh the following XCM messages that will be executed on the relay chain:

ump: {
      "2004": [
        "0x03140004000000000700e40b540213000000000700e40b540200060002286bee02000400183c0135080000140d0102040001010070617261d4070000000000000000000000000000000000000000000000000000",
        "0x03140004000000000700e40b540213000000000700e40b540200060002286bee02000400383c0035080000e803000000900100140d0102040001010070617261d4070000000000000000000000000000000000000000000000000000"
      ]
    }
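As a first step toward inspecting such a message by hand, the sketch below reads only the two leading bytes of the SCALE-encoded payload: the versioned XCM variant index and the compact-encoded instruction count. It uses just the leading bytes of the first message above, handles only the single-byte compact mode, and does not compute any weight. `hex_to_bytes` and `decode_header` are helper names made up for this example.

```rust
// Convert a hex string (optionally prefixed with "0x") into bytes.
fn hex_to_bytes(hex: &str) -> Option<Vec<u8>> {
    let hex = hex.strip_prefix("0x").unwrap_or(hex);
    if hex.len() % 2 != 0 {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(hex.get(i..i + 2)?, 16).ok())
        .collect()
}

// Return (version index, instruction count) from the message header.
// Only the single-byte SCALE compact mode (low two bits == 00) is handled.
fn decode_header(bytes: &[u8]) -> Option<(u8, u32)> {
    let version = *bytes.first()?;
    let len_byte = *bytes.get(1)?;
    if len_byte & 0b11 != 0 {
        return None; // multi-byte compact lengths are out of scope here
    }
    Some((version, u32::from(len_byte >> 2)))
}

fn main() {
    // Leading bytes of the first message in the list above.
    let prefix = "0x03140004000000000700e40b5402";
    let bytes = hex_to_bytes(prefix).expect("valid hex");
    match decode_header(&bytes) {
        Some((version, count)) => {
            println!("XCM version index {version}, {count} instructions")
        }
        None => println!("header not decodable by this sketch"),
    }
}
```

Getting an actual weight still requires decoding each instruction and applying the destination runtime's weigher, which this sketch does not attempt.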
   

@juangirini juangirini transferred this issue from paritytech/polkadot Aug 24, 2023
@the-right-joyce the-right-joyce added I10-unconfirmed Issue might be valid, but it's not yet known. T1-FRAME This PR/Issue is related to core FRAME, the framework. and removed J2-unconfirmed labels Aug 25, 2023
@franciscoaguirre franciscoaguirre added the T6-XCM This PR/Issue is related to XCM. label Mar 25, 2024
jonathanudd pushed a commit to jonathanudd/polkadot-sdk that referenced this issue Apr 10, 2024
Projects
Status: Backlog
Development

No branches or pull requests

7 participants