Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

design doc: invalid block progression #20

Merged
merged 1 commit into from
Jun 25, 2024
Merged

Conversation

tynes
Copy link
Contributor

@tynes tynes commented May 22, 2024

Description

A design doc describing the two invariants that can ensure
that the derivation pipeline can always progress without needing
to stall and do a lot of extra work when invalid batches are submitted.

Ideally this can be included in a hardfork before interop ships
to make the derivation pipeline much more hardened for interop.

A design doc describing the two invariants that can ensure
that the derivation pipeline can always progress without needing
to stall and do a lot of extra work when invalid batches are submitted.

Ideally this can be included in a hardfork before interop ships
to make the derivation pipeline much more hardened for interop.

If the batch data [makes it](https://github.com/ethereum-optimism/optimism/blob/8b5e3c264d2a3a114e7df5361acd8d4c1c700f81/op-node/rollup/derive/batch_queue.go#L223)
to the `BatchQueue`, then the batch is either properly deserialized and validated or is mapped into a "deposits only" batch.
This specifically describes the `BatchDrop` cases returned from [CheckBatch](https://github.com/ethereum-optimism/optimism/blob/8b5e3c264d2a3a114e7df5361acd8d4c1c700f81/op-node/rollup/derive/batches.go#L34).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs defined semantics for span batches

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could just say that we apply the rule for every singular batch reconstructed from the span batch? This should already give us the behavior that we need for interop, IIUC.

An alternative would be to drop the whole span batch, but I don't see an advantage from this over going over each singular batch case by case.

@protolambda protolambda self-requested a review May 22, 2024 18:32
Comment on lines +14 to +15
# Problem Statement + Context

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fault-proofs are the main motivation here that I think is missing from the design doc. With the current approach of re-attempting with the next block-input candidate for the same block-height, it becomes quite complicated and expensive to prepare a chain optimistically, as the same block-height can be revisited many times. For interop fault-proofs to combine multiple optimistically processed chains, a lot of dependencies are assumed, and all computational work is wasted if one of the pieces has to be revisited.

Comment on lines +56 to +60
# Risks & Uncertainties

- This adds risk around unsafe head reorgs. To date, there has not been an unsafe head reorg. Cases where batches would be
dropped and then there is another chance to progress the safe head via submitting another batch would be turned into
unsafe reorgs. The software and ops have been hardened after running it in production for some time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be worth clarifying that going from one failed batch to another different batch at the same L2 block height can be considered a form of "equivocation" (especially since it can be invalidated by a change of message validity on another chain), something that negatively impacts a lot of possible protocol designs, and is hence worth removing, despite the drawback in unsafe-chain recovery here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Also, the actual current Go op-batcher implementation wouldn't resend a batch for the same block height if that batch failed to import during derivation. Only a batcher restart would cause this. I'm also not aware of a single case since the launch of Bedrock where an invalid batch was produced that then got replaces by a valid one for the same L2 block height.

@sebastianst sebastianst self-assigned this Jun 5, 2024
Copy link
Member

@sebastianst sebastianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal! I'm supportive and the way I think about this is that the current rules allow to resubmit a faulty batch which feels like a defensive mechanism to recover from a faulty sequencer (implementation). However, we've now established enough trust into our implementation that we can take this approach instead. After all, faulty batches shouldn't happen with a correct implementation.

  • are there any other places in the DP where we also want to change the protocol to just derive a deposit-only block instead of causing a pipeline reset?

Comment on lines +56 to +60
# Risks & Uncertainties

- This adds risk around unsafe head reorgs. To date, there has not been an unsafe head reorg. Cases where batches would be
dropped and then there is another chance to progress the safe head via submitting another batch would be turned into
unsafe reorgs. The software and ops have been hardened after running it in production for some time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Also, the actual current Go op-batcher implementation wouldn't resend a batch for the same block height if that batch failed to import during derivation. Only a batcher restart would cause this. I'm also not aware of a single case since the launch of Bedrock where an invalid batch was produced that then got replaces by a valid one for the same L2 block height.

Comment on lines +11 to +12
New derivation pipeline invariants are introduced that are meant to reduce large resets of the derivation pipeline
when handling invalid batches.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to more explicitly explain (probably below, not here) why the new rules avoid larger resets. This could include explaining what the current behavior is under the preconditions of the new rules.

@mslipper
Copy link
Contributor

mslipper commented Jun 25, 2024

This is already an in-flight project, so I'll be merging this in. We can update the design doc in subsequent PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants