Introduce approval-voting/distribution benchmark #2621

Merged
merged 247 commits into from
Feb 5, 2024

@alexggh alexggh commented Dec 5, 2023

Summary

Built on top of the tooling and ideas introduced in #2528, this PR introduces a synthetic benchmark for measuring and assessing the performance characteristics of the approval-voting and approval-distribution subsystems.

Currently, this allows us to simulate the behaviour of these subsystems along the following dimensions:

TestConfiguration:
# Test 1
- objective: !ApprovalsTest
    last_considered_tranche: 89
    min_coalesce: 1
    max_coalesce: 6
    enable_assignments_v2: true
    send_till_tranche: 60
    stop_when_approved: false
    coalesce_tranche_diff: 12
    workdir_prefix: "/tmp"
    num_no_shows_per_candidate: 0
    approval_distribution_expected_tof: 6.0
    approval_distribution_cpu_ms: 3.0
    approval_voting_cpu_ms: 4.30
  n_validators: 500
  n_cores: 100
  n_included_candidates: 100
  min_pov_size: 1120
  max_pov_size: 5120
  peer_bandwidth: 524288000000
  bandwidth: 524288000000
  latency:
    min_latency:
      secs: 0
      nanos: 1000000
    max_latency:
      secs: 0
      nanos: 100000000
  error: 0
  num_blocks: 10
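For readers unfamiliar with the secs/nanos layout of the latency fields, here is a minimal sketch that converts the two values from the configuration above into milliseconds. The `Duration` class is a hypothetical helper written for this illustration, not a type from the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duration:
    """Hypothetical mirror of the secs/nanos pairs used in the YAML above."""
    secs: int
    nanos: int

    def as_millis(self) -> float:
        # 1 second = 1_000 ms, 1 ms = 1_000_000 ns
        return self.secs * 1_000 + self.nanos / 1_000_000


# Values copied from the `latency` section of the test configuration.
min_latency = Duration(secs=0, nanos=1_000_000)
max_latency = Duration(secs=0, nanos=100_000_000)

print(min_latency.as_millis(), max_latency.as_millis())  # prints "1.0 100.0"
```

With these values the emulated latency range is 1 ms to 100 ms.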

The approach

  1. We build a real overseer with the real implementations for approval-voting and approval-distribution subsystems.
  2. For a given network size, we pre-compute, for each validator, all the potential assignments and approvals it would send. Because this is a computation-heavy operation, the result is cached in a file on disk and reused as long as the generation parameters don't change.
  3. The messages are sent according to the configured parameters, which are split into 3 main benchmarking scenarios.
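The caching idea in step 2 can be sketched as follows. This is an illustration of the general technique (key the cache file on a hash of the generation parameters), not the code in this PR; the function names, the file naming scheme and the JSON storage format are all assumptions made for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_path(workdir, params):
    """Derive a file name from the generation parameters (hypothetical scheme)."""
    # sort_keys makes the digest independent of dictionary ordering.
    digest = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()
    return Path(workdir) / f"approvals-messages-{digest[:16]}.json"


def load_or_generate(workdir, params, generate):
    """Reuse the cached messages if present, otherwise generate and store them."""
    path = cache_path(workdir, params)
    if path.exists():
        return json.loads(path.read_text())
    messages = generate(params)
    path.write_text(json.dumps(messages))
    return messages


# Small demonstration with a stand-in generator that records each invocation.
generate_calls = []


def fake_generate(params):
    generate_calls.append(dict(params))
    return [f"validator-{i}" for i in range(params["n_validators"])]


with tempfile.TemporaryDirectory() as tmp:
    params = {"n_validators": 3, "n_cores": 2}
    first = load_or_generate(tmp, params, fake_generate)
    second = load_or_generate(tmp, params, fake_generate)  # served from disk
    changed = load_or_generate(tmp, {**params, "n_validators": 4}, fake_generate)

print(len(generate_calls))  # the generator only ran for the two distinct parameter sets
```

Changing any generation parameter produces a different digest, so a stale file is never picked up by accident.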

Benchmarking scenarios

Best case scenario approvals_throughput_best_case.yaml

It sends to approval-distribution only the minimum tranches required to gather the needed_approvals, so that a candidate is approved.

Behaviour in the presence of no-shows approvals_no_shows.yaml

It sends the tranche needed to approve a candidate when we have a maximum of num_no_shows_per_candidate tranches with no-shows for each candidate.

Maximum throughput approvals_throughput.yaml

It sends all the tranches for each block and measures the CPU usage and the network bandwidth required by the approval-voting and approval-distribution subsystems.
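To make the difference between the best case and the maximum throughput scenarios concrete, here is a deliberately simplified sketch. It assumes a candidate is approved once the cumulative number of assignments reaches needed_approvals and ignores no-shows and timing entirely, so it is a toy model rather than the actual approval logic. The function name and the example numbers are hypothetical.

```python
def tranches_to_send(assignments_per_tranche, needed_approvals, send_all=False):
    """Return how many tranches, counted from tranche 0, get sent.

    send_all=True models the maximum throughput scenario; otherwise we stop at
    the first tranche where enough assignments have been gathered (best case).
    """
    if send_all:
        return len(assignments_per_tranche)
    gathered = 0
    for tranche, count in enumerate(assignments_per_tranche):
        gathered += count
        if gathered >= needed_approvals:
            return tranche + 1
    # Not enough assignments overall: everything is sent.
    return len(assignments_per_tranche)


# Illustrative numbers only: assignments in tranches 0..3 and a threshold of 30.
example = [12, 9, 11, 10]
print(tranches_to_send(example, needed_approvals=30))                 # 3
print(tranches_to_send(example, needed_approvals=30, send_all=True))  # 4
```

The no-shows scenario sits between these two extremes, since each no-show requires additional tranches before a candidate can be approved.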

How to run it

cargo run -p polkadot-subsystem-bench --release -- test-sequence --path polkadot/node/subsystem-bench/examples/approvals_throughput.yaml
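If you want to run all three scenarios back to back, the command lines can be assembled with a small script such as the sketch below. It assumes that the other two scenario files live in the same examples directory as approvals_throughput.yaml, which this description does not state explicitly. The script only builds and prints the commands; pass each list to `subprocess.run` to execute it.

```python
import shlex

# Directory taken from the command above; assumed to hold all three scenarios.
EXAMPLES_DIR = "polkadot/node/subsystem-bench/examples"

SCENARIOS = [
    "approvals_throughput_best_case.yaml",
    "approvals_no_shows.yaml",
    "approvals_throughput.yaml",
]


def bench_command(scenario):
    """Build the argument list for one benchmark run."""
    return [
        "cargo", "run", "-p", "polkadot-subsystem-bench", "--release", "--",
        "test-sequence", "--path", f"{EXAMPLES_DIR}/{scenario}",
    ]


for scenario in SCENARIOS:
    print(shlex.join(bench_command(scenario)))
```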

Evaluating performance

Use the real subsystem metrics

If you follow the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-grafana to install Prometheus and Grafana locally, all real metrics for approval-distribution, approval-voting and the overseer are available, e.g.:
[Four screenshots taken on 2023-12-05 omitted.]

Profile with pyroscope

  1. Set up Pyroscope following the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-pyroscope, then run any of the benchmark scenarios with --profile as an argument.
  2. Open the Pyroscope dashboard in Grafana, e.g.:
[Screenshot taken on 2024-01-09 omitted.]

Useful logs

  1. Network bandwidth requirements:
Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
  2. CPU usage by the approval-distribution/approval-voting subsystems:
approval-distribution CPU usage 84.061s
approval-distribution CPU usage per block 8.406s
approval-voting CPU usage 96.532s
approval-voting CPU usage per block 9.653s
  3. Time passed until a given block is approved:
Chain selection approved  after 3500 ms hash=0x0101010101010101010101010101010101010101010101010101010101010101
Chain selection approved  after 4500 ms hash=0x0202020202020202020202020202020202020202020202020202020202020202
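When comparing several runs, it can be handy to pull these numbers out of the log automatically. The sketch below parses the three kinds of lines shown above with regular expressions. The patterns are written against the sample output in this description only, so they are an assumption about the log format and may need adjusting if the wording changes.

```python
import re

PAYLOAD = re.compile(
    r"Payload bytes (received from|sent to) peers: (\d+) KiB total, (\d+) KiB/block"
)
CPU = re.compile(
    r"(approval-distribution|approval-voting) CPU usage( per block)? ([\d.]+)s"
)
APPROVED = re.compile(r"Chain selection approved\s+after (\d+) ms hash=(0x[0-9a-fA-F]+)")


def parse_summary(log_text):
    """Collect bandwidth, CPU and approval-time figures from benchmark output."""
    summary = {"payload_kib": {}, "cpu_seconds": {}, "approved_after_ms": {}}
    for line in log_text.splitlines():
        if m := PAYLOAD.search(line):
            direction = "received" if m.group(1) == "received from" else "sent"
            summary["payload_kib"][direction] = {
                "total": int(m.group(2)),
                "per_block": int(m.group(3)),
            }
        elif m := CPU.search(line):
            scope = "per_block" if m.group(2) else "total"
            summary["cpu_seconds"].setdefault(m.group(1), {})[scope] = float(m.group(3))
        elif m := APPROVED.search(line):
            summary["approved_after_ms"][m.group(2)] = int(m.group(1))
    return summary


# Sample lines copied from the log excerpts above.
SAMPLE = """\
Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
approval-distribution CPU usage 84.061s
approval-distribution CPU usage per block 8.406s
approval-voting CPU usage 96.532s
approval-voting CPU usage per block 9.653s
Chain selection approved  after 3500 ms hash=0x0101010101010101010101010101010101010101010101010101010101010101
"""

summary = parse_summary(SAMPLE)
print(summary["payload_kib"]["sent"]["per_block"])  # 62997
```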

Using the benchmark to quantify improvements from #1178 + #1191

Using a versi-node, we compare the scenario where all new optimisations are disabled with a scenario where tranche0 assignments are sent in a single message and a conservative simulation where the coalescing of approvals gives us just a 50% reduction in the number of messages we send.

Overall, what we see is a speedup of around 30-40% in the time it takes to process the necessary messages and a 30-40% reduction in the necessary bandwidth.

Best case scenario comparison (minimum required tranches sent).

Unoptimised

    Number of blocks: 10
    Payload bytes received from peers: 53289 KiB total, 5328 KiB/block
    Payload bytes sent to peers: 52489 KiB total, 5248 KiB/block
    approval-distribution CPU usage 6.732s
    approval-distribution CPU usage per block 0.673s
    approval-voting CPU usage 9.523s
    approval-voting CPU usage per block 0.952s

vs Optimisation enabled

   Number of blocks: 10
   Payload bytes received from peers: 32141 KiB total, 3214 KiB/block
   Payload bytes sent to peers: 37314 KiB total, 3731 KiB/block
   approval-distribution CPU usage 4.658s
   approval-distribution CPU usage per block 0.466s
   approval-voting CPU usage 6.236s
   approval-voting CPU usage per block 0.624s

Worst case: all tranches sent. This is very unlikely; it happens when sharding breaks.

Unoptimised

   Number of blocks: 10
   Payload bytes received from peers: 746393 KiB total, 74639 KiB/block
   Payload bytes sent to peers: 729151 KiB total, 72915 KiB/block
   approval-distribution CPU usage 118.681s
   approval-distribution CPU usage per block 11.868s
   approval-voting CPU usage 124.118s
   approval-voting CPU usage per block 12.412s

vs optimised

    Number of blocks: 10
    Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
    Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
    approval-distribution CPU usage 84.061s
    approval-distribution CPU usage per block 8.406s
    approval-voting CPU usage 96.532s
    approval-voting CPU usage per block 9.653s
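For reference, the relative reductions can be recomputed directly from the totals listed above. The snippet below is a small helper written for this purpose; the input numbers are copied from the two comparisons and nothing else is assumed.

```python
def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# (unoptimised, optimised) totals copied from the comparisons above.
BEST_CASE = {
    "payload received (KiB)": (53289, 32141),
    "payload sent (KiB)": (52489, 37314),
    "approval-distribution CPU (s)": (6.732, 4.658),
    "approval-voting CPU (s)": (9.523, 6.236),
}
WORST_CASE = {
    "payload received (KiB)": (746393, 503993),
    "payload sent (KiB)": (729151, 629971),
    "approval-distribution CPU (s)": (118.681, 84.061),
    "approval-voting CPU (s)": (124.118, 96.532),
}

for name, table in (("best case", BEST_CASE), ("worst case", WORST_CASE)):
    for metric, (before, after) in table.items():
        print(f"{name:10} {metric:30} -{reduction_pct(before, after):.1f}%")
```

With these figures, the computed reductions range from roughly 14% (payload sent, worst case) to roughly 40% (payload received, best case), depending on the metric.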

TODOs

[x] Polish implementation.
[x] Use what we have so far to evaluate #1191 before merging.
[x] List of features and additional dimensions we want to use for benchmarking.
[x] Run the benchmark on hardware similar to versi and kusama nodes.
[ ] Add the benchmark to CI to catch performance regressions.
[ ] Rebase on latest changes for network emulation.

sandreim and others added 30 commits August 25, 2023 19:15
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
The pr migrates:
- paritytech/polkadot#7554

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
... the param was incorrectly appended to v9 instead of creating a new
version as v10.

... failure example here:
  https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/3799036

@alexggh alexggh removed request for koute and athei January 22, 2024 07:39
@alindima
Contributor

Question: are we aiming to first merge #2970 and then rebase this PR, or to first merge this PR into #2970 ?

@alexggh
Contributor Author

alexggh commented Jan 23, 2024

Question: are we aiming to first merge #2970 and then rebase this PR, or to first merge this PR into #2970 ?

First merge #2970 and then rebase this PR.

Base automatically changed from sandreim/availability-write-bench to master January 25, 2024 17:52
@alexggh alexggh added the R0-silent Changes should not be mentioned in any release notes label Jan 29, 2024
Contributor

@sandreim sandreim left a comment

LGTM!

@alexggh
Contributor Author

alexggh commented Feb 2, 2024

Addressed all review feedback; I will merge this PR once the CI passes.

@alexggh alexggh added this pull request to the merge queue Feb 5, 2024
Merged via the queue into master with commit f9f8868 Feb 5, 2024
124 checks passed
@alexggh alexggh deleted the alexaggh/subsystem-bench-approvals branch February 5, 2024 07:27
@Polkadot-Forum

This pull request has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/what-are-subsystem-benchmarks/8212/1

Labels
R0-silent: Changes should not be mentioned in any release notes
T10-tests: This PR/Issue is related to tests.
T12-benchmarks: This PR/Issue is related to benchmarking and weights.
Projects
Status: Completed
5 participants