Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate blockstore-processor for --block-verification-method #4728

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

steviez
Copy link

@steviez steviez commented Jan 31, 2025

Problem

Unified scheduler has proven itself to be much more performant, and with future plans to increase CU's, that gap will likely widen.

Summary of Changes

Unified scheduler is already the default, but make blockstore-processor emit a warning message that it will be completely removed in the future

Data

Much of this data is probably redundant to data that Ryo originally provided, but it doesn't hurt to measure again with recent data before we start phasing blockstore-processor out.

Startup Take 1

Here is one restart against MNB.

  • Blue is a control node
  • Purple is running --block-verification-method unified-scheduler
  • Orange is running --block-verification-method blockstore-processor
  • Purple and orange have same server config in same DC, so about as close as we can reasonably get to "same hardware"
image

Each node came online ~1k slots back; unified caught up in ~6 minutes whereas blockstore-processor took ~22 minutes.

Startup Take 2

I switched which node was running which method, and restarted the nodes after an intentional 5 minute delay between process stop and process restart.
image

Each node was 1400-1500 slots back; unified caught up in ~9 minutes whereas blockstore-processor required ~28 minutes

Steady State

For the next two graphs, these nodes are running as follows:

  • Blue is blockstore-processor
  • Purple is unified-scheduler

One metric for consideration is replay-slot-stats.replay_total_elapsed which is the total time from bank creation to bank frozen. In this metric, unified shows an advantage by a few %. Note that the delivery of shreds will somewhat limit how much better one method can be than another; we see unified have a clear advantage at startup when the whole block is ready:
image

Another metric for consideration is replay-slot-stats.execute_us, which is the total amount of time spent executing transactions across all threads. We see a non-trivial difference here where blockstore-processor is using less total CPU time. Given that these nodes are replaying the same transactions, that means blockstore-processor is doing so more efficiently:
image

Ultimately, I think we care most about completing replay as fast as possible, and 25% more CPU time is acceptable. This final chart shows replay-loop-timing-stats.wait_receive_elapsed_us. In this graph, we can see that unified is spending much more time waiting for shreds. This means that it is scheduling replay of transactions much quicker and waiting to be fed more. This isn't quite an apples-to-apples comparison given that unified schedules transactions for replay in a non-blocking, asynchronous manner:
image

@steviez steviez force-pushed the val_deprecate_blockstore_processor branch from 939b0fd to 3896364 Compare January 31, 2025 19:13
apfitzge
apfitzge previously approved these changes Feb 6, 2025
Copy link

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - deprecation is the first step to removal. The new approach is much faster at catching up, there's no real reason to use the old approach.

@steviez steviez force-pushed the val_deprecate_blockstore_processor branch from 3896364 to 685ad90 Compare February 13, 2025 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants