[Das Benchmarks]: Stress test 1 Full or Bridge Node against x Light Nodes #89

derrandz · 2022-10-10T09:50:46Z

Important

Merge when celestiaorg/celestia-node#1376 is merged and go.mod is corrected to point to celestia-node main

Overview

We are interested in benchmarking the bridge node against groups of light nodes of increasing size, starting from (say) 100 and going up to 100K.

To do so, this PR adds test cases and local telemetry to enable the benchmark, along with metrics collection for visualizing the benchmark results.

More details in #79

Changes

How to run:

1. Start testground:

   $ make tg-start

2. In another terminal, start the telemetry infrastructure:

   $ make telemetry-infra-up

3. Run the test-case:

   $ make tg-run-composition RUNNER=local-docker TESTPLAN=das-benchmarks COMPOSITION=001-lights-dasing-latest-from-bridge-16-50-28

4. Go to http://localhost:3000 to access Grafana.
5. Add Prometheus as a data source.
   5.1 For the URL, if you are running in a droplet, use your instance's IP instead of localhost.
6. Import the dashboard under ./build/grafana/dashboards/benchmarks.json
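When several compositions need to be run back to back, the run step above could be scripted. The following is only a minimal sketch: `compositionArgs` and the `-run` switch are hypothetical names introduced here, and the only facts taken from this PR are the make target and its RUNNER, TESTPLAN and COMPOSITION variables.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// compositionArgs builds the arguments for `make tg-run-composition`,
// mirroring the variables used in the run step of this PR.
// The function name is hypothetical.
func compositionArgs(runner, testplan, composition string) []string {
	return []string{
		"tg-run-composition",
		"RUNNER=" + runner,
		"TESTPLAN=" + testplan,
		"COMPOSITION=" + composition,
	}
}

func main() {
	args := compositionArgs("local-docker", "das-benchmarks",
		"001-lights-dasing-latest-from-bridge-16-50-28")
	fmt.Println("make", args)

	// Only execute when explicitly asked to (hypothetical "-run" switch),
	// so the sketch can be dry-run on a machine without testground.
	if len(os.Args) > 1 && os.Args[1] == "-run" {
		cmd := exec.Command("make", args...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "run failed:", err)
			os.Exit(1)
		}
	}
}
```

Without `-run` the program only prints the command it would execute, which makes it easy to check the variables before starting a long benchmark.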

Dependencies

This PR depends on #132

Owners: @derrandz

Bidon15 · 2022-10-17T13:31:55Z

docs/test-plans/002-DAS-Benchmarks/test-cases/tc-001-x-light-finish-das-before-block-time.md

+
+1. The Full Node has the latest head
+2. All light nodes are network-bootstrapped and connected to the full node (no discovery required)
+3. Share size is 32


It would be awesome to have a 32/64/128 matrix, since we are already assuming in #96 that it works 😅
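One way to picture the suggested matrix is as the cross product of light node counts and sizes. The sketch below is an illustration only: `benchCase` and `matrix` are hypothetical names, and the light node counts are placeholders. Only the 32/64/128 sizes come from the comment above.

```go
package main

import "fmt"

// benchCase is a hypothetical description of one benchmark run.
type benchCase struct {
	LightNodes int
	Size       int // the "share size" from the test-case preconditions
}

// matrix returns every combination of light node count and size.
func matrix(lightNodes, sizes []int) []benchCase {
	cases := make([]benchCase, 0, len(lightNodes)*len(sizes))
	for _, n := range lightNodes {
		for _, s := range sizes {
			cases = append(cases, benchCase{LightNodes: n, Size: s})
		}
	}
	return cases
}

func main() {
	// Light node counts are illustrative; only 32/64/128 come from the review comment.
	for _, c := range matrix([]int{28, 100}, []int{32, 64, 128}) {
		fmt.Printf("lights=%d size=%d\n", c.LightNodes, c.Size)
	}
}
```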

derrandz · 2022-10-25T17:09:15Z

To configure the celestia node with the benchmark parameters, these two PRs have to go in:

celestia-node#1224, "refactor(availability): use functional options pattern to configure availability implementations": makes SampleAmount configurable
celestia-node#1225, "refactor(daser): use functional options pattern to configure daser": makes SampleFrom configurable
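For readers unfamiliar with the functional options pattern named in both PR titles, here is a generic Go sketch of it. It is not the celestia-node API: `availabilityParams`, `Option`, `WithSampleAmount`, `newParams` and the default value are all hypothetical, and only the SampleAmount setting name comes from the comment above.

```go
package main

import "fmt"

// availabilityParams is a hypothetical stand-in for the availability configuration.
type availabilityParams struct {
	SampleAmount int
}

// Option mutates the parameters; each setting gets its own constructor.
type Option func(*availabilityParams)

// WithSampleAmount is a hypothetical option name, not necessarily the one
// used in celestia-node.
func WithSampleAmount(n int) Option {
	return func(p *availabilityParams) { p.SampleAmount = n }
}

// newParams starts from defaults and applies the options in order.
func newParams(opts ...Option) availabilityParams {
	p := availabilityParams{SampleAmount: 16} // default chosen for illustration only
	for _, opt := range opts {
		opt(&p)
	}
	return p
}

func main() {
	fmt.Println(newParams().SampleAmount)
	fmt.Println(newParams(WithSampleAmount(32)).SampleAmount)
}
```

The appeal for a benchmark is that a test plan can override a single parameter per run while every other setting keeps its default.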

plans/002-das-benchmarks/tests/sync-latest/run_light.go

derrandz · 2022-11-09T17:17:20Z

The DASer PR is not required to configure the light node to start DASing from a specific height. We changed the logic of the test to unblock this PR, so it is no longer needed.

The other one that's mentioned in the comment might be required down the line for different sample amounts.

derrandz · 2022-11-09T17:25:11Z

Define the Non Functional Requirements that have to be met for this test plan alongside the metrics to collect and their thresholds.

Referencing #108

derrandz · 2022-11-11T11:24:04Z

Update

The course of this PR is changing to include test isolation from test setup, metrics' collection and the infrastructure to support the metrics collection.

Although hacky, it's convenient to take this route. After getting a fully working version, we will rewrite the history of this PR to clean this up.

More context in here

The required infrastructure efforts are documented in #109

derrandz · 2022-11-16T13:06:02Z

Ongoing work to enable blackbox telemetry is in:

feat(telemetry): add blackbox instrumentation to the header module + share module + p2p bandwidth metrics celestia-node#1376

derrandz · 2022-11-22T23:49:43Z

Progress update regarding the blackbox telemetry efforts:

We managed to get a few charts that measure how well the bridge node serves the DASing process. For now, since we benchmark a bridge node against a multitude of light nodes, we display charts per light node instance (you can choose any instance from the dropdown in the screenshot).

Aggregate charts that display the overall state of the DASing process across all light nodes are the next step. (Check the PR's TODOs)

(Screenshots from a local run with 1 validator, 1 bridge node and 28 light nodes)

The Selection of Light Nodes from the dropdown

derrandz · 2022-11-25T20:56:11Z

Some improvements to charting:

Added the native InfluxDB metrics for testground to track the number of alive light nodes (a way to experiment with testground's InfluxDB).
Switched the histogram charts to a table-like layout (the DASing time and block time charts now read more clearly per height).

Makefile

build/docker-compose.yml

testkit/nodekit/node.go

go.mod

Bidon15

Let's get this home run and figure out the nits later.

derrandz · 2023-02-08T14:04:40Z

Final nits are tracked in #167 for future resolution.

Bidon15 reviewed Oct 17, 2022

derrandz force-pushed the tp002/das-benchmark branch 2 times, most recently from 9fd0724 to d2023b2 Compare October 20, 2022 15:23

derrandz self-assigned this Oct 24, 2022

derrandz added labels Oct 24, 2022: experiment (Experiments to find out either the tech is suitable for our needs), test (Request for creating a test-case), testground (related to testground)

derrandz force-pushed the tp002/das-benchmark branch from d2023b2 to d182c15 Compare October 25, 2022 15:51

derrandz commented Oct 25, 2022

plans/002-das-benchmarks/tests/sync-latest/run_light.go (outdated)

derrandz force-pushed the tp002/das-benchmark branch 3 times, most recently from fcaea7e to 27ab5f7 Compare November 9, 2022 17:10

derrandz force-pushed the tp002/das-benchmark branch from 39bdd5f to 03662cb Compare November 16, 2022 13:04

This was referenced Nov 23, 2022

chore: Adapt Big Blocks tests to 8MB size #123

Merged

feat: add sanity compositions for pfd/gsbn #124

Merged

derrandz force-pushed the tp002/das-benchmark branch from 5a6e90c to 581f078 Compare November 25, 2022 15:11

derrandz added 5 commits November 29, 2022 01:12

doc: add first das benchmark test plan documentation

2246a0e

chore: add filler files to change later

b2d2024

chore/doc: update documentation and fill in missing gaps

f190b90

test(benchmark:wip): 1 bridge node, x light nodes

78381a1

chore: move to the new dir structure

a260ce9

derrandz force-pushed the tp002/das-benchmark branch 2 times, most recently from 5b3778f to c4c03b9 Compare November 29, 2022 01:34

Bidon15 and others added 8 commits February 1, 2023 16:37

chore: bump to fixed cosmos sdk version

c08e22b

fix: revert mempool back to v1 from v2

77cedec

chore: adapt to latest main's config sealing

0ee43e4

fix: use fork of latest main until we fix 1583

d12eb9b

revert: sealing for test-infra

6f69b4a

chore: new changes in bumped versions for app

33e4e29

chore: make test-infra awesome

e4c0ad5

chore: dependencies update

614de11

derrandz force-pushed the tp002/das-benchmark branch from 3086241 to 614de11 Compare February 1, 2023 20:48

derrandz mentioned this pull request Feb 1, 2023

bump(metrics): upgrade otel to the latest version celestiaorg/celestia-node#1537

Merged

5 tasks

derrandz added 3 commits February 7, 2023 16:14

chore: fix dependencies and api changes

dd63f08

Merge branch 'main' into tp002/das-benchmark

fa2e566

pr: implement viet suggestions

09ba6e8

derrandz requested a review from Bidon15 February 8, 2023 12:05

derrandz changed the title ~~TP002: Stress test 1 Full or Bridge Node against x Light Nodes~~ [Das Benchmarks]: Stress test 1 Full or Bridge Node against x Light Nodes Feb 8, 2023

Bidon15 requested changes Feb 8, 2023

Makefile

build/docker-compose.yml

testkit/nodekit/node.go (outdated)

This was referenced Feb 8, 2023

testground/infra: create otel collector in k8s #165

Closed

testground/config: MakeFile cleanup #166

Open

Bidon15 reviewed Feb 8, 2023

go.mod (outdated)

derrandz added 2 commits February 8, 2023 13:22

dep: use release for celestia-node

f1cc1f9

pr: address viet suggestions;

d57cc55

derrandz requested a review from Bidon15 February 8, 2023 13:28

derrandz added 2 commits February 8, 2023 13:43

chore: smoll fix

2cce283

chore: add missing err handling

bd9f092

Bidon15 approved these changes Feb 8, 2023

refactor: simplify testcase name;

3ecf91a

Bidon15 mentioned this pull request Feb 8, 2023

testground/node: check balances after successful submission of PFBs #163

Merged

5 tasks

derrandz merged commit 6465619 into main Feb 8, 2023

Bidon15 mentioned this pull request Feb 9, 2023

[EPIC] Benchmark Test-Plan for Bridge/Full nodes serving shares to LNs #83

Closed

4 tasks
