fix: fix sync relayer collaboration #4562
Merged
What problem does this PR solve?
`HeaderProcess` takes the current snapshot once and uses it to process the whole message. For the entire duration of that processing, changes to the db are invisible to it. Normally this is not a problem. However, if a compact block message is received and processed successfully in the meantime, so that the chain inserts a new block whose header is also contained in the headers message still being processed, a concurrency problem occurs. The timing problem is this: the header process holds the old snapshot for longer than the compact block process takes to complete.
As a result, the block status seen through the old snapshot differs from the actual status.
ckb/sync/src/types/mod.rs
Lines 2315 to 2335 in f3efbf2
The actual status should be `BLOCK_VALID`, but because the block cannot be seen in the old snapshot, the returned status will be `UNKNOWN`.
ckb/sync/src/synchronizer/headers_process.rs
Lines 290 to 336 in f3efbf2
The `UNKNOWN` status causes the header to be reinserted into the header map, which changes the block's status to `HEADER_VALID`. Because of this abnormal state, all compact blocks received afterwards are put into the orphan block pool. Synchronization then pauses until no compact block messages have been received for a while, at which point the sync protocol can resume by re-downloading the abnormal block.

This PR forcibly resets the snapshot in get block status to obtain the latest status.
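
The interleaving described above can be replayed deterministically in a small standalone sketch. This is not the CKB code: `Shared`, `Snapshot`, `insert_block`, `replay_race`, and the two-variant `BlockStatus` enum are hypothetical stand-ins, and the real status lookup involves more state than a set of hashes. The sketch only illustrates why a lookup through a snapshot held across the insert returns `Unknown`, while a lookup through a freshly loaded snapshot returns `BlockValid`.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};

// Hypothetical, simplified status enum (the real one has more variants).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BlockStatus {
    Unknown,
    BlockValid,
}

// Immutable view of the chain at one point in time (assumed model).
pub struct Snapshot {
    stored: HashSet<u64>, // hashes of blocks already inserted into the chain
}

impl Snapshot {
    // Status as seen through this particular snapshot only.
    pub fn status(&self, hash: u64) -> BlockStatus {
        if self.stored.contains(&hash) {
            BlockStatus::BlockValid
        } else {
            BlockStatus::Unknown
        }
    }
}

// Shared state whose current snapshot is replaced on every block insert.
pub struct Shared {
    current: RwLock<Arc<Snapshot>>,
}

impl Shared {
    pub fn new() -> Self {
        let empty = Snapshot { stored: HashSet::new() };
        Shared { current: RwLock::new(Arc::new(empty)) }
    }

    // Load the latest snapshot; the caller may hold it for any length of time.
    pub fn snapshot(&self) -> Arc<Snapshot> {
        self.current.read().unwrap().clone()
    }

    // Stand-in for the compact block process: publish a new snapshot
    // that contains the block. Previously loaded snapshots are unchanged.
    pub fn insert_block(&self, hash: u64) {
        let mut guard = self.current.write().unwrap();
        let mut stored = guard.stored.clone();
        stored.insert(hash);
        *guard = Arc::new(Snapshot { stored });
    }
}

// Replays the interleaving and returns (stale status, fresh status).
pub fn replay_race(hash: u64) -> (BlockStatus, BlockStatus) {
    let shared = Shared::new();
    let held = shared.snapshot(); // header process starts, keeps old snapshot
    shared.insert_block(hash); // compact block process completes in between
    let stale = held.status(hash); // lookup through the old snapshot
    let fresh = shared.snapshot().status(hash); // lookup after reloading
    (stale, fresh)
}
```

Calling `replay_race(42)` in this model yields `(BlockStatus::Unknown, BlockStatus::BlockValid)`: the first value corresponds to the behaviour before this PR, the second to reloading the snapshot at lookup time.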
Check List
Tests
Release note