eth/downloader: dynamically move pivot even during chain sync #21529
Conversation
headers := packet.(*headerPack).headers
log.Warn("Pivot seemingly stale, moving", "old", pivot, "new", headers[0].Number)
pivot = headers[0].Number.Uint64()
At this point, any two headers received will cause us to do a WriteLastPivotNumber to the db. I'm thinking this might lead to some internal assumptions failing, or corruption, if, for example, a peer moves the pivot back to genesis, or block 5 ...?
If we get moved back to genesis, then we're done ..?
So shouldn't we also check that the returned headers match the numbers we requested?
Hmm, yeah. Problem is that checking the number is not particularly useful: I could just give you the correct number and a junk root hash. We could at least add PoW verification (besides the number), but that won't work on Clique networks (acceptable?).
Checked the header numbers in this PR for now. PoW checks would need to push the consensus engine into the downloader too.
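The number check discussed above can be sketched as follows. This is a minimal, hypothetical helper (the real downloader works on `types.Header` with a `big.Int` number field, not this stripped-down struct), showing the idea of rejecting a pivot response whose header numbers don't match what we asked for:

```go
package main

import "fmt"

// header is a stand-in for types.Header, reduced to the number field.
type header struct{ Number uint64 }

// checkPivotHeaders verifies that headers returned for a pivot query carry
// the numbers we actually requested (the current pivot and pivot+64), so a
// peer cannot move our pivot to an arbitrary block like genesis or block 5.
// Hypothetical sketch, not the actual go-ethereum code.
func checkPivotHeaders(requested []uint64, got []header) error {
	if len(got) > len(requested) {
		return fmt.Errorf("too many headers: have %d, requested %d", len(got), len(requested))
	}
	for i, h := range got {
		if h.Number != requested[i] {
			return fmt.Errorf("header %d: have #%d, want #%d", i, h.Number, requested[i])
		}
	}
	return nil
}

func main() {
	req := []uint64{1000, 1064}
	fmt.Println(checkPivotHeaders(req, []header{{1000}, {1064}})) // <nil>
	fmt.Println(checkPivotHeaders(req, []header{{5}}))            // mismatch error
}
```

As the comment above notes, this only pins the number; a malicious master peer could still return the right number with a junk root hash, which is why PoW verification was discussed as a further (heavier) option.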
@holiman Issues addressed, PTAL
LGTM, some minor questions/comments
When we implemented fast sync, we picked a static pivot block and kept it indefinitely. This worked well because we didn't have state pruning, and any (recent) block we picked was sure to remain available until we finished syncing.
Eventually we implemented in-memory GC, which caused stale state to gradually disappear from the network. Although the head of the state trie was garbage collected quite fast (15 mins), most of the trie remained constant throughout sync. Thus it was still fine to pick a static pivot block, keep syncing towards it, and only dynamically bump it once we approached the chain head. This could be done via an easy hack where the block collector pushes the pivot forward if it accumulates more than 64 blocks after the old pivot.
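The old result-queue heuristic can be sketched in a few lines. This is a simplified stand-in (the function name and its inputs are illustrative, not the downloader's actual API): once the collector has queued more than 64 blocks past the current pivot, the pivot is pushed forward so it trails the queued head by 64 blocks:

```go
package main

import "fmt"

// updatePivot mimics the old heuristic: if the block collector has
// accumulated more than 64 blocks past the current pivot, push the pivot
// forward so it stays 64 blocks behind the highest queued block.
// Hypothetical sketch, not the actual downloader code.
func updatePivot(pivot, highestQueued uint64) uint64 {
	if highestQueued > pivot+64 {
		return highestQueued - 64 // bump pivot, keep it 64 blocks behind
	}
	return pivot
}

func main() {
	fmt.Println(updatePivot(1000, 1050)) // 1000: not enough blocks past pivot
	fmt.Println(updatePivot(1000, 1100)) // 1036: pivot bumped to head-64
}
```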
Unfortunately, with snap sync, the snapshot belonging to a pivot block completely disappears the instant it gets older than 128 blocks. The effect is that we start syncing from a recent pivot, and when it goes stale, we suspend state sync until the pivot gets updated, which won't happen until we reach the end of the chain. This is a problem because we waste precious time.
This PR fixes the issue by adding an additional pivot staleness detection mechanism. When we are fast syncing the header skeleton, after every batch (~36K fast blocks (no exec)), we do an additional header retrieval, asking for the current pivot and the next one (+64 blocks). If there is a next one returned, we switch state sync to the new one. This way we can detect that the snapshots are imminently disappearing from the network and we should switch to a new state root before that.
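The staleness check described above boils down to a simple decision after each skeleton batch. The sketch below is hypothetical (`fetchHeader` stands in for the actual network request to the master peer): we ask for the header at pivot+64, and if the peer already has it, the chain has progressed far enough that the old pivot's snapshot will soon vanish, so we switch state sync to the newer root:

```go
package main

import "fmt"

// maybeMovePivot sketches the per-batch staleness check: ask the master
// peer for the header 64 blocks past the current pivot, and if it exists,
// move the pivot there before the old snapshot disappears from the
// network. fetchHeader is a stand-in for the real header retrieval.
func maybeMovePivot(pivot uint64, fetchHeader func(number uint64) bool) uint64 {
	if fetchHeader(pivot + 64) {
		return pivot + 64 // newer pivot available, switch state sync to it
	}
	return pivot // peer has nothing newer yet, keep the current pivot
}

func main() {
	head := uint64(2000) // pretend this is the peer's chain head
	fetch := func(n uint64) bool { return n <= head }
	fmt.Println(maybeMovePivot(1900, fetch)) // 1964: peer has pivot+64
	fmt.Println(maybeMovePivot(1990, fetch)) // 1990: pivot+64 not yet available
}
```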
Cost wise this extra header retrieval doesn't matter much: we're retrieving on average 1 header per 36K headers imported, so all in all we waste about 150KB of bandwidth during a full sync of mainnet.
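As a rough sanity check on that 150KB figure (the chain height and per-header size below are assumptions, not numbers from the PR: mainnet was around 10.7M blocks when this was written, and an RLP-encoded header is on the order of 500 bytes):

```go
package main

import "fmt"

func main() {
	// Back-of-the-envelope for the bandwidth overhead of the extra
	// pivot-staleness header retrievals. Both constants are assumptions.
	const (
		chainHeight = 10_700_000 // assumed mainnet height at the time
		batchSize   = 36_000     // headers per skeleton batch (from the PR)
		headerSize  = 500        // assumed bytes per RLP-encoded header
	)
	retrievals := chainHeight / batchSize
	fmt.Println(retrievals)              // ~297 extra header retrievals
	fmt.Println(retrievals * headerSize) // ~148500 bytes, i.e. ~150KB
}
```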
Implementation wise the PR went with locks because there's no coordinator / runner in the downloader, so introducing one plus channels would have been quite an invasive change. Unfortunately this meant that we needed to retain the old mechanism that operates on the result queue and also add a new mechanism that operates on header retrievals.
Security wise the dynamic pivot depends on our master peer (it cannot be attacked by non-master peers). If the master peer is malicious, it can feed us an arbitrary starting snapshot. This is at worst a griefing attack, because after we're done importing the chain, we'll fast sync on top and at worst fall back to the current performance.