Block propagation delay under load #525

heifner · 2022-11-30T21:38:15Z

During start_block, a deadline is set so that transaction (trx) processing does not exceed the block production window configured by produce-time-offset-us and cpu-effort-percent. Pending transactions are then processed in a tight loop on the main thread until either all of them have been processed or the deadline is hit.
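As a rough illustration of the loop described above, here is a minimal sketch. It is not the nodeos implementation: `pending_trx`, `deadline_result`, and `process_until_deadline` are hypothetical names, and a transaction is modeled as a plain callable.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

using clock_type  = std::chrono::steady_clock;
// Hypothetical stand-in for a pending transaction: a callable that does the work.
using pending_trx = std::function<void()>;

struct deadline_result {
    std::size_t processed    = 0;     // transactions applied in this pass
    std::size_t remaining    = 0;     // transactions still queued afterwards
    bool        deadline_hit = false; // true if the loop stopped because of the deadline
};

// Apply queued transactions on the calling thread until the queue is empty
// or the deadline has passed. Nothing else can run on this thread meanwhile.
inline deadline_result process_until_deadline(std::deque<pending_trx>& queue,
                                              clock_type::time_point deadline) {
    deadline_result r;
    while (!queue.empty()) {
        if (clock_type::now() >= deadline) {
            r.deadline_hit = true;
            break;
        }
        pending_trx trx = std::move(queue.front());
        queue.pop_front();
        trx();
        ++r.processed;
    }
    r.remaining = queue.size();
    return r;
}
```

The point of the sketch is that the only exit conditions are an empty queue and the deadline, so any other work posted to the same thread waits for the whole pass to finish.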

For a production block, if a block comes in during the block production window it is dropped.

For a speculative block, the incoming block is not processed until the main thread is released to run the next task. Currently this means that the incoming block is not processed until start_block has completed, which can delay processing of the incoming block by as much as 500 ms.

Block propagation under high load can be improved by exiting start_block for a speculative block as soon as a block arrives at the node, so that the incoming block can be processed immediately after the current trx finishes executing.
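One way to picture the proposed behavior is a flag that another thread sets when a block arrives and that the trx loop checks between transactions. The sketch below assumes exactly that; `start_block_sketch`, `on_block_received`, `should_interrupt`, and `exit_reason` are hypothetical names, and keeping the deadline check in both modes is a simplification made for this example, not a statement about the actual change.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

using clock_type  = std::chrono::steady_clock;
// Hypothetical stand-in for a pending transaction.
using pending_trx = std::function<void()>;

enum class exit_reason { exhausted, deadline, interrupted };

struct start_block_sketch {
    std::atomic<bool> block_received{false}; // set from another thread (assumption)
    bool              producing = false;     // true for a production block

    // Called when a new block has arrived and should preempt speculative work.
    void on_block_received() { block_received.store(true, std::memory_order_release); }

    // Only a speculative block is interrupted; a production block runs to its deadline.
    bool should_interrupt() const {
        return !producing && block_received.load(std::memory_order_acquire);
    }

    // Process transactions; the interrupt flag is checked between transactions,
    // so the trx currently executing always finishes first.
    exit_reason run(std::deque<pending_trx>& queue, clock_type::time_point deadline,
                    std::size_t& processed) {
        processed = 0;
        while (!queue.empty()) {
            if (should_interrupt())            return exit_reason::interrupted;
            if (clock_type::now() >= deadline) return exit_reason::deadline;
            pending_trx trx = std::move(queue.front());
            queue.pop_front();
            trx();
            ++processed;
        }
        return exit_reason::exhausted;
    }
};
```

With this shape the worst-case wait for an incoming block is the duration of one trx rather than the rest of the window. The caller would then need to restart the speculative block after applying the incoming one, since the unprocessed transactions are still queued.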


heifner · 2022-11-30T21:40:01Z

I marked this as an enhancement, but I do think it should be back-ported to 3.1 & 3.2.

heifner · 2022-11-30T21:52:12Z

Without this change, eosnetworkfoundation/product#127 cannot propagate a block as quickly as it otherwise could. If 127 is implemented so that it does not need the main thread for block header validation, then the block can be propagated quickly without this change.

However, even with 127 implemented so that it does not need the main thread, this solution is still needed, because you want to start applying the state changes from the incoming block as soon as possible. A speculative block should be aborted as soon as there is a new block to apply.

heifner · 2022-12-08T13:44:04Z

Example where this is needed:

Dec  7 07:45:26 xxx nodeos[676220]: info  2022-12-07T07:45:26.419 nodeos    producer_plugin.cpp:502       on_incoming_block    ] Received block 7e7eaf8bed3b3731... #217868155 @ 2022-12-07T07:45:26.500 signed by eosauthority [trxs: 32, lib: 217867829, confirmed: 0, net: 4336, cpu: 5498, elapsed: 4955, time: 5589, latency: -80 ms]
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.419 nodeos    producer_plugin.cpp:1685      start_block          ] Starting block #217868156 at 2022-12-07T07:45:26.419 producer eosauthority
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.419 nodeos    producer_plugin.cpp:1842      remove_expired_trxs  ] Processed 2 expired transactions of the 3192 transactions in the unapplied queue, Persistent expired 0, Other expired 2
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.420 nodeos    subjective_billing.hpp:204    remove_expired       ] Processed 37910 subjective billed transactions, Expired 41
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.420 nodeos    producer_plugin.cpp:2281      process_incoming_trx ] Processing 2926 pending transactions
Dec  7 07:45:26 xxx nodeos[676220]: warn  2022-12-07T07:45:26.503 net-9     net_plugin.cpp:2689           process_next_trx_mes ] ["eosn-wax-seed54:9876 - 594c2df" - 42 10.88.141.115:9876] Dropping trx, too many trx in progress 104858348 bytes
Dec  7 07:45:26 xxx nodeos[676220]: warn  2022-12-07T07:45:26.600 net-3     net_plugin.cpp:2689           process_next_trx_mes ] ["405c2c85f3b4:14999 - 6809b41" - 20 51.222.156.169:14998] Dropping trx, too many trx in progress 104857946 bytes
Dec  7 07:45:26 xxx nodeos[676220]: warn  2022-12-07T07:45:26.800 net-8     net_plugin.cpp:2689           process_next_trx_mes ] ["ec1183b43025:9876 - ee0ed8e" - 10 51.222.153.167:35777] Dropping trx, too many trx in progress 104858072 bytes
Dec  7 07:45:26 xxx nodeos[676220]: warn  2022-12-07T07:45:26.815 net-4     net_plugin.cpp:2689           process_next_trx_mes ] ["us-api01.eosams.xeos.me:9101 - 663d936" - 26 207.244.100.59:9101] Dropping trx, too many trx in progress 104857986 bytes
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.900 nodeos    producer_plugin.cpp:2306      process_incoming_trx ] Processed 948 pending transactions, 1978 left
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.900 nodeos    producer_plugin.cpp:2368      schedule_production_ ] Speculative Block Created
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.903 nodeos    producer_plugin.cpp:438       on_incoming_block    ] received incoming block 217868156 0cfc677c8a77f63239c21150ca47e72ad61770af45e5cf1b37c98002d1ad68b8
Dec  7 07:45:26 xxx nodeos[676220]: debug 2022-12-07T07:45:26.903 nodeos    producer_plugin.cpp:187       report               ] Block trx idle: 3457us out of 483979us, success: 6, 1481us, fail: 942, 301972us, other: 177069us

It is very likely that block 217868156 came in before 07:45:26.900, but had to wait until then because of the speculative block processing in start_block.
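The numbers in the log above are internally consistent, which supports reading the roughly 480 ms between 07:45:26.420 and 07:45:26.900 as time the main thread spent in the trx loop. The following sketch only re-adds figures copied from the log lines; the constant names are made up for this example, and how nodeos defines the "idle" and "other" buckets is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Figures copied from the "Block trx idle" report line (microseconds).
constexpr std::int64_t idle_us    = 3457;
constexpr std::int64_t success_us = 1481;
constexpr std::int64_t fail_us    = 301972;
constexpr std::int64_t other_us   = 177069;
constexpr std::int64_t total_us   = 483979;

// Figures copied from the process_incoming_trx lines and the report line.
constexpr std::int64_t pending_trxs   = 2926;
constexpr std::int64_t processed_trxs = 948;
constexpr std::int64_t left_trxs      = 1978;
constexpr std::int64_t success_trxs   = 6;
constexpr std::int64_t fail_trxs      = 942;

// Time not reported as idle in the report line.
constexpr std::int64_t non_idle_us() { return success_us + fail_us + other_us; }

static_assert(idle_us + non_idle_us() == total_us, "time buckets add up to the total");
static_assert(processed_trxs + left_trxs == pending_trxs, "processed + left == pending");
static_assert(success_trxs + fail_trxs == processed_trxs, "success + fail == processed");
```

Only 3457 us of the 483979 us window is reported as idle, so a block arriving shortly after 07:45:26.420 would have had almost no opportunity to be handled before 07:45:26.900.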

enf-ci-bot added the triage label Nov 30, 2022

enf-ci-bot added this to Team Backlog Nov 30, 2022

enf-ci-bot moved this to Todo in Team Backlog Nov 30, 2022

heifner added enhancement New feature or request actionable and removed triage labels Nov 30, 2022

stephenpdeos added the 👍 lgtm label Dec 1, 2022

heifner self-assigned this Dec 6, 2022

heifner added the OCI Work exclusive to OCI team label Dec 6, 2022

heifner moved this from Todo to In Progress in Team Backlog Dec 6, 2022

heifner added a commit that referenced this issue Dec 6, 2022

GH-525 Interrupt speculative start_block when a block is received

0103412

heifner mentioned this issue Dec 6, 2022

[3.1] Interrupt speculative start_block when a block is received #543

Merged

heifner moved this from In Progress to Awaiting Review in Team Backlog Dec 8, 2022

heifner added a commit that referenced this issue Dec 8, 2022

GH-525 Use received block as interrupt for start_block when in specul…

1f44c07

…ative mode. Use deadline when in production mode.

heifner added a commit that referenced this issue Dec 8, 2022

GH-525 Fixed spelling of yield

8027ecb

heifner added a commit that referenced this issue Dec 9, 2022

GH-525 Minor cleanup

e63482a

heifner added a commit that referenced this issue Dec 12, 2022

GH-525 Honor deadline if we can produce

40a7e34

heifner added a commit that referenced this issue Dec 12, 2022

Merge branch 'GH-525-block-propagation-3.1' into GH-525-block-propaga…

4cdbfdc

…tion-3.2

heifner added a commit that referenced this issue Dec 13, 2022

Merge remote-tracking branch 'wax-leap/wax-leap-3.2' into GH-525-bloc…

01c1adb

…k-propagation-3.2-wax

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Change received block net_plugin log to info for debugging

3beb9f0

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Make forkdb thread-safe

74366a2

heifner mentioned this issue Dec 14, 2022

[3.2] Interrupt speculative start_block when a block is received #574

Closed

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Remove unused block channel.

99533be

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Move scope_exit of schedule_production_loop since any call sho…

b408aa2

…uld restart block because speculative block may have been interrupted.

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Use shared_mutex to allow multiple readers

a143803

heifner added a commit that referenced this issue Dec 14, 2022

GH-525 Only interrupt start_block for validated block_header.

6f3ad96

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Make forkdb thread-safe

43d45ed

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Remove unused block channel.

7468ded

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Move scope_exit of schedule_production_loop since any call sho…

d0d7fa7

…uld restart block because speculative block may have been interrupted.

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Use shared_mutex to allow multiple readers

b3391af

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Only interrupt start_block for validated block_header.

9b1401a

heifner added a commit that referenced this issue Dec 15, 2022

GH-525 Minor cleanup from PR comments.

8f12805

heifner added a commit that referenced this issue Dec 16, 2022

GH-525 Add open_impl/close_impl so all methods of fork_database follo…

453e5d0

…w the same pattern.

heifner added a commit that referenced this issue Dec 16, 2022

GH-525 Remove unused incoming::channels::transaction

9b5ec62

heifner added a commit that referenced this issue Dec 20, 2022

Merge branch 'GH-525-block-propagation-3.1' into GH-525-block-propaga…

95d5620

…tion-3.2

heifner added a commit that referenced this issue Dec 20, 2022

Merge branch 'GH-525-block-propagation-3.2' into GH-525-block-propaga…

fcad3c3

…tion-main

heifner added a commit that referenced this issue Dec 22, 2022

Merge remote-tracking branch 'origin/main' into GH-525-block-propagat…

fa178a9

…ion-main

heifner added a commit that referenced this issue Dec 22, 2022

Merge branch 'GH-525-block-propagation-main' into GH-568-block-propagate

d956b55

heifner added a commit that referenced this issue Jan 10, 2023

Merge branch 'release/3.1' into GH-525-block-propagation-3.1

ef9b3ea

heifner added a commit that referenced this issue Jan 10, 2023

GH-525 Add check for existing to be consistent

3d77eb0

heifner added a commit that referenced this issue Jan 10, 2023

GH-525 rename start_block_interrupted to should_interrupt_start_block

2d6307b

heifner added a commit that referenced this issue Jan 18, 2023

Merge pull request #543 from AntelopeIO/GH-525-block-propagation-3.1

be7fc57

[3.1] Interrupt speculative start_block when a block is received

heifner added a commit that referenced this issue Jan 18, 2023

Merge remote-tracking branch 'origin/release/3.1' into GH-525-block-p…

bce4029

…ropagation-3.2-merge

heifner mentioned this issue Jan 18, 2023

[3.1 -> 3.2] Interrupt speculative start_block when a block is received #648

Merged

heifner added a commit that referenced this issue Jan 18, 2023

Merge pull request #648 from AntelopeIO/GH-525-block-propagation-3.2-…

4107ba4

…merge [3.1 -> 3.2] Interrupt speculative start_block when a block is received

heifner added a commit that referenced this issue Jan 18, 2023

Merge remote-tracking branch 'origin/release/3.2' into GH-525-block-p…

5c1a4f4

…ropagation-main-merge

heifner mentioned this issue Jan 18, 2023

[3.2 -> main] Interrupt speculative start_block when a block is received #649

Merged

heifner closed this as completed in #649 Jan 18, 2023

heifner added a commit that referenced this issue Jan 18, 2023

Merge pull request #649 from AntelopeIO/GH-525-block-propagation-main…

51c1117

…-merge [3.2 -> main] Interrupt speculative start_block when a block is received

github-project-automation bot moved this from Awaiting Review to Done in Team Backlog Jan 18, 2023

bhazzard mentioned this issue Feb 2, 2023

Lighter Validation for Relays eosnetworkfoundation/product#127

Closed

2 tasks

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Remove dead code

fa7e531

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Minimum code changes to resolve on reconnect

91881c9

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Refactor resolve_and_connect so that connection object is reused.

3cc2bb7

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Used shared_lock for shared_mutex

4eb7700

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Used std::move

428987c

heifner added a commit that referenced this issue Oct 2, 2024

GH-525 Remove duplicate code

bf853a3
