Client/Block: stabilize block fetcher #3240
Conversation
_task: JobTask
): { destroyFetcher: boolean; banPeer: boolean; stepBack: bigint } {
const stepBack = BIGINT_0
const destroyFetcher = !(error.message as string).includes(
`Blocks don't extend canonical subchain`
@g11tech what is the reason to destroy the fetcher if there is another error? Not sure here 🤔
this is the only error that is expected if the peer didn't give you the correct chain; for all other errors something went wrong in the fetcher itself, and hence the fetcher needs to be cleared out and a new fetcher restarted as a means to be robust against issues. Why do you want to remove it?
So if validation errors can now come in, that means we also expect those and should not destroy the fetcher (which would just lead to re-queuing of the job)
I should study the Fetcher stack a bit more before I can enlighten :) I read somewhere that if you destroy the fetcher it is a "critical" error, I would assume that then the fetcher is broken or something. Will study it some more and will get back to this later.
yes, so the fetcher will be/should be reinitiated (if our peer's latest/best is correct, or we handle it in a better way)
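For readers of this thread, a minimal sketch of the decision being discussed; the function name, the `JobTask` shape, and the `banPeer` handling are assumptions for illustration — only the `destroyFetcher` check follows the diff above:

```typescript
const BIGINT_0 = BigInt(0)

// Hypothetical task shape, for illustration only.
interface JobTask {
  first: bigint
  count: number
}

function processStoreError(
  error: Error,
  _task: JobTask
): { destroyFetcher: boolean; banPeer: boolean; stepBack: bigint } {
  const stepBack = BIGINT_0
  // Only one error is "expected": the peer served blocks that don't extend
  // our canonical subchain. Keep the fetcher alive for that case; any other
  // error indicates something went wrong inside the fetcher itself, so it is
  // destroyed and re-initialized as a robustness measure.
  const destroyFetcher = !(error.message as string).includes(
    `Blocks don't extend canonical subchain`
  )
  const banPeer = true
  return { destroyFetcher, banPeer, stepBack }
}
```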
@@ -86,7 +86,6 @@ export class BlockFetcher extends BlockFetcherBase<Block[], Block> {
}
// Supply the common from the corresponding block header already set on correct fork
const block = Block.fromValuesArray(values, { common: headers[i].common })
await block.validateData()
I think it's ok to not validate data if this is PoS, because of the parent-child relationship validation that will happen while storing into the skeleton, so we can do an if here
I think I agree. For the backfill process we can (especially for the reverse block fetcher) just accept the blocks, right? If we validate that the reported blocks have the block hash which we expect in the end, then we can accept them, and we don't have to validate all data such as: does every txn have a valid signature? If the tx trie matches the block hash, we know that the CL expects this block to be valid; if it were invalid, the CL would be broken, since it gave us the wrong chain tip block.
yes, just validating the hash here is good enough
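A hedged sketch of the hash-only acceptance being agreed on here; the helper name and the source of `expectedHash` are assumptions:

```typescript
import { equalsBytes } from '@ethereumjs/util'
import type { Block } from '@ethereumjs/block'

// Accept a backfilled block purely by hash: if the block hashes to the value
// we already expect (e.g. the parent hash of the previously accepted block
// during reverse backfill), its header and body are committed to by that hash
// and need no per-tx signature checks here.
function acceptByHash(block: Block, expectedHash: Uint8Array): void {
  if (!equalsBytes(block.hash(), expectedHash)) {
    throw new Error(`Blocks don't extend canonical subchain`)
  }
}
```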
What is the latest state here? Is this ready for review, i.e. can @g11tech have a final look? Or not yet?
From my side this is ready for review :)
// Upon putting blocks into blockchain (for BlockFetcher), `validateData` is called again
// In ReverseBlockFetcher we do not need to validate the entire block, since CL
// expects us to sync with the requested chain tip header
await block.validateDataIntegrity()
Is it really necessary to add a wrapper function that's only used in one place? Feels like we should just call block.verifyData and then explain in the comments why we're not validating transactions.
Do not have a very strong opinion here but would cautiously agree
(Or is there somewhat strong reasoning (which there might be) that the API is better off with (yet) another explicit validation method?)
Not really a strong reason to do this. I will remove the method and directly call into verifyData
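A sketch of what that direct call could look like, using the `verifyData(onlyHeader, verifyTxs)` signature quoted in the PR description; the wrapper function here is only for illustration:

```typescript
import type { Block } from '@ethereumjs/block'

// Hedged sketch: call verifyData directly instead of going through a one-off
// validateDataIntegrity() wrapper; signature as quoted in the PR description.
async function verifyForBackfill(block: Block): Promise<void> {
  // Skip per-transaction validation: during backfill the CL-supplied chain
  // tip hash already commits to the block contents (see discussion above).
  await block.verifyData(false /* onlyHeader */, false /* verifyTxs */)
}
```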
Have addressed
Thanks for the updates, LGTM!
This PR attempts to stabilize the ReverseBlockFetcher.
`block.validateData` takes a lot of time. This means that once we request big blocks, this validation loop will block the event queue, so the fetcher job actually expires and the results are not written. A related problem is that the validation loop also blocks the network I/O loops (this still happens after this PR):
TRACE[01-18|05:34:44.616] Failed RLPx handshake addr=127.0.0.1:30303 conn=staticdial err="read tcp 127.0.0.1:42862->127.0.0.1:30303: i/o timeout"
Note: the ReverseBlockFetcher writes to the skeleton chain, which itself does not re-perform block validation. However, in the `BlockFetcher`, if one stores the blocks by writing to `chain.blockchain` (so a blockchain object, not the skeleton), this will internally re-run `block.validateData()`. So for the `BlockFetcher` the block is actually verified twice: once upon request, and once upon storing it. So we can safely (?) remove this from the `BlockFetcher`.

Still WIP, but this actually got my client un-stuck! I got to 655624 (tail block) in about 10 minutes. Before this, I would always get stuck at 656724 since the fetcher would expire my jobs.
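A simplified sketch of the double-validation path described above; `fetchAndStore` and the structurally-typed blockchain parameter are illustrative stand-ins, not the actual client code:

```typescript
import type { Block } from '@ethereumjs/block'

// Illustrative stand-in for the BlockFetcher flow.
async function fetchAndStore(
  blocks: Block[],
  blockchain: { putBlocks(blocks: Block[]): Promise<void> }
): Promise<void> {
  for (const block of blocks) {
    // First validation, right after the blocks come in from the peer.
    // This is the call this PR removes for the BlockFetcher.
    await block.validateData()
  }
  // Storing via the blockchain object validates each block again internally,
  // which is why the upfront pass above is redundant.
  await blockchain.putBlocks(blocks)
}
```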
Side question: is there a way to put the `validateData` job at the end of the Node.js event queue? If we can do this, it means we validate one block, then reply to the devp2p messages (ping messages, for instance), and are also open for sockets to receive other jobs (this is the error listed above: Geth tries to write to our socket, but since we do not read, it will time out at some point). I tried `setImmediate`, `process.nextTick`, and `setTimeout`, but this does not seem to work. Thoughts? (Try this locally too!! It should get your client unstuck 😄 )
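For illustration, a hedged sketch of the yielding idea from the side question: await a macrotask between blocks so queued socket I/O gets a turn. Since validating one big block is itself synchronous CPU work, this likely only helps between blocks, which may be why these attempts did not resolve the stall:

```typescript
import { setImmediate as yieldToEventLoop } from 'node:timers/promises'
import type { Block } from '@ethereumjs/block'

// Yield back to the event loop after each block so pending devp2p pings and
// socket reads can be served before the next validation starts.
async function validateWithYield(blocks: Block[]): Promise<void> {
  for (const block of blocks) {
    await block.validateData()
    // Schedule the next iteration as a macrotask, letting queued I/O run first.
    await yieldToEventLoop()
  }
}
```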
This PR does 3 things (can cherry-pick them out):

- Removes the upfront `block.validateData()` call from the `BlockFetcher` (the blocks are re-validated when they are stored into the blockchain anyway).
- Adds the `verifyTxs` parameter to block's `verifyData(onlyHeader: boolean = false, verifyTxs: boolean = true)`. Setting this to `false` skips tx validation.
- Changes the `best()` method to always return a peer if one is available.
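Finally, a hedged sketch of what the `best()` change could look like; the `Peer` shape, the `minHeight` parameter, and the selection logic are assumptions for illustration:

```typescript
// Hypothetical peer shape, for illustration only.
interface Peer {
  idle: boolean
  latest?: bigint
}

// Sketch of "always return a peer if one is available": if no idle peer meets
// the target height, fall back to any idle peer instead of returning
// undefined, so fetcher jobs are never starved for lack of a qualified peer.
function best(peers: Peer[], minHeight: bigint): Peer | undefined {
  const idle = peers.filter((p) => p.idle)
  const qualified = idle.find((p) => (p.latest ?? BigInt(0)) >= minHeight)
  return qualified ?? idle[0]
}
```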