[ETCM-186] Fix strict pick with too few blocks #723
Conversation
@@ -68,14 +68,17 @@ class BlockFetcher(
  private def handleCommands(state: BlockFetcherState): Receive = {
    case PickBlocks(amount) => state.pickBlocks(amount) |> handlePickedBlocks(state) |> fetchBlocks
    case StrictPickBlocks(from, atLeastWith) =>
      val minBlock = from.min(atLeastWith).max(1)
      log.debug("Strict Pick blocks from {} to {}", from, atLeastWith)
      // FIXME: Consider having StrictPickBlocks calls guaranteeing this
I finally decided not to touch this in this PR, to minimize the impact on our syncing process.
It all makes sense to me. Do you think that writing a test in a similar fashion to "go back to earlier block in order to find a common parent with new branch" would be too hard, in order to expose that issue?
This is needed as soon as possible for experimental testnet purposes, so I will merge it as it is. Any future improvements can be done later. cc @jmendiola222, @kapke
Description
Fixes branch resolution when triggered at the beginning of the chain, which caused sync to end up in an irreparable state.
Issue
Given 2 nodes:
1. Both have chains of the same weight but forked (top block with number X). When node 1 requests the last blocks from node 2, it won't import any. This step results in node 1 logging:
Imported no blocks
2. Node 2 extends its chain with several mined blocks.
3. Node 1 asks for blocks starting from block X+1, but as node 2 forked, they won't be concatenable (see the sketch after this list). This step results in node 1 logging:
4. The next request from node 1 will be from an earlier block than X, ideally before the fork. Node 1 currently gets stuck here with logging:
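To make "won't be concatenable" in step 3 concrete, here is a minimal, self-contained sketch with hypothetical types (the real mantis block headers have many more fields): a block received from a peer only extends the local chain if its parent hash matches the hash of the current best block, which cannot happen when the peer built its blocks on a different fork.

```scala
// Hypothetical, simplified header; only what is needed to show the parent-hash check.
final case class Header(number: BigInt, hash: String, parentHash: String)

// An incoming block extends our chain only if it points at our current best block.
def isConcatenable(currentBest: Header, incoming: Header): Boolean =
  incoming.number == currentBest.number + 1 && incoming.parentHash == currentBest.hash

val node1Top  = Header(number = 10, hash = "0xaaa", parentHash = "0x999") // top of node 1's fork
val fromNode2 = Header(number = 11, hash = "0xbbb", parentHash = "0xccc") // built on node 2's fork

isConcatenable(node1Top, fromNode2) // false: node 1 cannot import blocks built on the other fork
```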
How to reproduce
It's very hard to reproduce this automatically at the integration test level (maybe after ETCM-127 it will be easier to include a test).
Change the code to:
a. Delay requests for 2 minutes to allow time for mining blocks in between, by replacing the fetchHeaders function from BlockFetcher (see part (a) of the sketch after this list).
b. Have not every block be broadcast, to simplify the whole process; the last blocks should still be broadcast so as to trigger a new fetch. Replace the broadcastBlock function from BlockBroadcast (see part (b) of the sketch after this list).
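Since the actual replacement snippets are not included above, here is a hedged, self-contained sketch of what (a) and (b) could look like. All names (the BlockFetcherState field, the request/broadcast stand-ins, the "last 2 blocks" threshold) are hypothetical and only illustrate the idea; the real BlockFetcher and BlockBroadcast code in mantis differs.

```scala
import scala.concurrent.duration._

// (a) Hypothetical sketch: block for 2 minutes before each header request so that
// blocks can be mined on both nodes in between. Crude, but fine for a manual repro.
final case class BlockFetcherState(nextBlockToFetch: BigInt) // stand-in for the real state

def fetchHeaders(state: BlockFetcherState): Unit = {
  Thread.sleep(2.minutes.toMillis)
  // Stand-in for the real request-sending code in BlockFetcher.
  println(s"requesting headers starting from block ${state.nextBlockToFetch}")
}

// (b) Hypothetical sketch: skip broadcasting most mined blocks, but still broadcast
// the newest ones so the peer's fetcher is triggered by them.
def broadcastBlock(blockNumber: BigInt, topBlockNumber: BigInt): Unit =
  if (blockNumber > topBlockNumber - 2) // only the last couple of blocks get broadcast
    println(s"broadcasting block $blockNumber to handshaked peers")
  // earlier blocks are silently dropped to simplify the scenario
```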
Then:
1. Start the 2 nodes, connected to each other, at the same time.
2. While they are both blocked in their sleep, mine 10 blocks in each.
3. Wait for node 1 to fetch the 10 blocks from node 2 and fail to import them (with log: Imported no blocks).
4. Mine 10 blocks on node 2; broadcasting them should trigger a re-fetch from node 1.
5. That will halt node 1's progress with infinite logs:
Solution
The strict pick from value is capped to 1 in case it's lower than that (minBlock = from.min(atLeastWith).max(1)). Our current code wasn't working due to the condition
.filter(_.headOption.exists(block => block.number <= lower))
on strictPickBlocks, which is never true when lower is a negative number; the latter can no longer happen after capping the from value (see the sketch below).
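As an illustration of why that condition blocked the pick (a simplified model, not the actual BlockFetcherState.strictPickBlocks implementation): block numbers are never negative, so with a negative lower bound the filter can never succeed, while a lower bound capped to 1 lets the pick go through.

```scala
// Simplified stand-in for the fetcher's ready-blocks queue; the real code works on full blocks.
val readyBlockNumbers: Seq[BigInt] = Seq(BigInt(1), BigInt(2), BigInt(3))

// Mirrors the shape of the problematic condition: only pick if the first ready block
// is at or below the `lower` bound.
def strictPick(lower: BigInt): Option[Seq[BigInt]] =
  Some(readyBlockNumbers).filter(_.headOption.exists(number => number <= lower))

strictPick(BigInt(-5)) // None: the condition can never hold, so no blocks are ever picked
strictPick(BigInt(1))  // Some(List(1, 2, 3)): after capping `from` to 1, the pick succeeds
```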
Testing
Attempting to reproduce the issue after the fix should result in node 1 not halting itself and importing the last 10 blocks from node 2.
Up to discussion
I'm not sure how the node will handle resolving branches of up to 1000 blocks, as configured on the testnet; maybe we should change it to 100 so that the resolution requires a single message? If not, further analysis of the sync process should be done.