Recover nodes requesting ChangeView when possible #579

jsolman · 2019-01-30T00:36:49Z

Recovery messages can be used in response to ChangeView messages in order to bring nodes back to the current state of the consensus. This is achieved safely by keeping the payloads from the network messages and including pertinent parts along with the witness invocation scripts in a recovery message. The recovery messages facilitate regenerating the network messages and reprocessing them.

The recovery is all-encompassing, including:

- Replaying ChangeView messages.
- Replaying the PrepareRequest message, if it was present.
- Replaying PrepareResponse messages, if any were present.
- Replaying the Commit messages.
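
As a rough illustration of the replay order listed above, here is a minimal Python sketch. The `RecoveryMessage` class, its field names, and the `handle` callback are hypothetical stand-ins for illustration, not the types used in this PR; payloads are plain strings here rather than regenerated, signed network messages.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RecoveryMessage:
    """Hypothetical container; field names are illustrative only."""
    change_views: List[str] = field(default_factory=list)
    prepare_request: Optional[str] = None
    prepare_responses: List[str] = field(default_factory=list)
    commits: List[str] = field(default_factory=list)


def replay(msg: RecoveryMessage, handle: Callable[[str, str], None]) -> None:
    """Pass each recovered payload to `handle` in the order listed above."""
    for payload in msg.change_views:
        handle("ChangeView", payload)
    # The prepare request is optional: it may not have been sent yet.
    if msg.prepare_request is not None:
        handle("PrepareRequest", msg.prepare_request)
    for payload in msg.prepare_responses:
        handle("PrepareResponse", payload)
    for payload in msg.commits:
        handle("Commit", payload)
```

A caller would pass the node's normal message handler as `handle`, so that recovered messages go through the same processing path as messages received directly from the network.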

With n = 3f + 1 nodes, generally only f nodes must respond to a ChangeView message with a recovery message. The exception is a node that has its CommitSent flag set: such a node always replies to ChangeView messages with a recovery message.

Nodes that have CommitSent set are not allowed to change views.
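
The responder rule can be sketched as follows. The description does not say how the f responders are chosen, so the selection used here (the f validators whose indices follow the requester's, wrapping around) is an assumption made purely for illustration, and the function names are hypothetical.

```python
def max_faulty(n: int) -> int:
    """Return f for a network of n = 3f + 1 validators (rounded down otherwise)."""
    return (n - 1) // 3


def should_send_recovery(my_index: int, requester_index: int,
                         n: int, commit_sent: bool) -> bool:
    """Decide whether this node answers a ChangeView with a recovery message."""
    # A node that has already sent its commit always answers.
    if commit_sent:
        return True
    f = max_faulty(n)
    # ASSUMED selection rule: the f validators after the requester respond.
    return any((requester_index + i) % n == my_index for i in range(1, f + 1))
```

With n = 7 (so f = 2) and a request from validator 5, validators 6 and 0 would respond under this assumed rule, plus any validator that has already sent its commit.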

Some complications arise from needing to accept a previously generated prepare request when the primary is being recovered, either coming from another view or in the same view without having yet sent the prepare request. In these cases the primary accepts its own previously generated prepare request, but does not send a prepare response.

Since a recovery message may be sent each time a ChangeView message is received, replayed ChangeView messages must not trigger duplicate recovery responses; otherwise they could be used for a DDoS. This is achieved by having every ChangeView message carry a newly generated timestamp, so each request has a unique hash. In this way each ChangeView message is answered only once with a broadcast recovery message.
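
A minimal sketch of this replay protection, assuming a simplified payload of validator index, requested view number, and timestamp hashed with SHA-256. The `payload_hash` function and `RecoveryResponder` class are hypothetical, and the real payload hash covers the full serialized consensus payload.

```python
import hashlib


def payload_hash(validator_index: int, new_view: int, timestamp: int) -> str:
    """Hash a simplified ChangeView payload; the timestamp makes each request unique."""
    data = f"{validator_index}:{new_view}:{timestamp}".encode()
    return hashlib.sha256(data).hexdigest()


class RecoveryResponder:
    """Remembers which ChangeView hashes have already been answered."""

    def __init__(self) -> None:
        self._answered = set()

    def on_change_view(self, validator_index: int, new_view: int,
                       timestamp: int) -> bool:
        """Return True if a recovery message should be broadcast for this request."""
        h = payload_hash(validator_index, new_view, timestamp)
        if h in self._answered:
            return False  # replayed message: already answered once
        self._answered.add(h)
        return True
```

A replayed copy of the same ChangeView hashes to the same value and is ignored, while a genuinely new request from the same validator carries a new timestamp and is answered.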

The code has been tested for a number of scenarios and is ready for thorough review and subsequent merge.

Here are some of the cases I tested:

- Repeatedly restoring the view when no prepare request was sent.
  - Kill off all nodes except one that isn't the primary; bring nodes back up one at a time while killing the previously brought-up node each time. Validated that the view is always restored on the nodes.
- Restore the primary's view with no prepare request sent and have it generate the prepare request.
- Kill the primary after it has sent the prepare request, bring it back up with only one other node up, and verify that the primary restores and accepts its own previously sent prepare request.
- Restore a node to a view with the prepare request sent, but not enough recovery messages to commit.
- Restore a node to a view with the prepare request sent and enough messages to commit; verify restore and commit.
- Restore a node from a higher view number on its change view to a lower number that has commit set, and verify it restores to the lower view and commits.

Closes #426

jsolman · 2019-01-31T03:17:16Z

By keeping the signatures from the change view that moved to the current view, we will be able to allow nodes to always accept regeneration messages to a newer view. I'm working on making that change now.

vncoelho

Nice, Jeff. It looks like it is going in precisely the right direction.
I have not reviewed it yet; I only looked at the main variables.

jsolman · 2019-01-31T13:41:34Z

@erikzhang You may want to take a look at b938f96. In some testing where I had killed one CN node after it sent its commit, the node got stuck on start-up after attempting once to send the commit. You may want to cherry-pick b938f96 directly from here into consensus/improved_dbft if you like it.

jsolman · 2019-01-31T14:22:56Z

Since the regeneration message carries the change view witness signatures, it should be possible to send recovery even if the PrepareRequest wasn't sent. If the nodes were able to agree on a view to move to, other nodes need to be recovered to that view as well. It may even be possible to recover the node that will be the primary, which would greatly help speed up consensus.

jsolman · 2019-02-16T06:39:58Z

My last commit message should have said Remove unused private member variable

jsolman · 2019-02-16T07:03:43Z

I updated the top overview again.

jsolman · 2019-02-16T07:17:43Z

Oops didn't mean to delete your comment. I agree with the change from 3c801dc

jsolman · 2019-02-16T07:45:05Z

It's looking pretty clean. Shall I give it one more round of testing now?

* Add commit phase to consensus algorithm (#534)
* Add commit phase to consensus algorithm
* fix tests
* Prevent repeated sending of `Commit` messages
* RPC call gettransactionheight (#541)
* getrawtransactionheight Nowadays two calls are need to get a transaction height, `getrawtransaction` with `verbose` and then use the `blockhash`. Other option is to use `confirmations`, but it can be misleading.
* Minnor fix
* Shargon's tip
* modified
* Allow to use the wallet inside a RPC plugin (#536)
* Clean code
* Clean code
* Minor fix on mempoolVerified
* Add MemoryPool Unit tests. Fix bug on initital start of Persisting the Genesis block.
* Prevent `ConsensusService` from receiving messages before starting (#573)
* Prevent `ConsensusService` from receiving messages before starting
* fixed tests - calling OnStart now
* Consensus recovery log (#572)
* Pass store to `ConsensusService`
* Implement `ISerializable` in `ConsensusContext`
* Start from recovery log
* Fix unit tests due to constructor taking the store.
* Add unit tests for serializing and deserializing the consensus context.
* Combine `ConsensusContext.ChangeView()` and `ConsensusContext.Reset()`
* Add `PreparationHash` field to `PrepareResponse` to prevent replay attacks from malicious primary (#576)
* Fixed a problem where `PrepareResponse.PreparationHash` was not assigned.
* Load context from store only when height matches
* Recover nodes requesting ChangeView when possible (#579)
* Fixes bug in `OnPrepareRequestReceived()`
* Send `RecoveryMessage` only when `message.NewViewNumber <= context.ViewNumber`
* Fix and optimize view changing (#590)
* Allow to ignore the recovery logs
* Add `isRecovering` (#594)
* Fix accepting own prepare request (#596)
* Pick some changes from #575.
* Fixes `Prefixes`
* Restore transactions from saved consensus context (#598)
* Refactoring
* AggressiveInlining (#606)
* Reset Block reference when consensus context is initialized after block persist. (#608)
* Change `ConsensusPayload` for compatibility (#609)

* update consensus fix minor issues and change the formatting
* minor issues

jsolman added 2 commits January 29, 2019 16:22

Initial cut at regeneration.

dbb5454

Clean up.

121eebe

jsolman requested a review from erikzhang January 30, 2019 00:36

jsolman added 9 commits January 30, 2019 19:31

Fix missing set of PreparationWitnessInvocationScripts.

ed3027c

Fix missing initialize, save and restore of PreparationWitnessInvocationScripts.

96a02a9

Fix serialize/deserialize.

f760cc8

Fix ConsensusContext serialize/deserialize and add unit test.

fbf3f60

Add support for tracking ChangeViewWitnessInvocationScripts and use in regeneration.

301f780

Primary regenerates anyone trying to change view from a lower view number.

8a78768

Restore expected view state and ChangeViewWitnessInvocationScripts when accepting regeneration.

7b1b02b

Allow any of the CN nodes that have received the prepare request to send the regenration message in response to change view.

7cc1f92

Obtain more Perparations from regeneration message if received in the same view.

7a4c8df

jsolman mentioned this pull request Jan 31, 2019

Restore MemoryPool transactions from saved consensus context. #575

Closed

Remove unused using statement.

f8d5804

erikzhang reviewed Jan 31, 2019

neo/Consensus/ConsensusService.cs (outdated)

jsolman and others added 6 commits January 31, 2019 00:35

Fix missing fields when maing the RegenerationMessage.

1efc09b

Rename the RegenerationMessage to RecoveryMessage.

fd2b08a

Load context from store only when height matches

b3125c5

OnChangeView only triggers recovery if requesting a lower view than the current.

cc616d1

Fix bug in RegenerateSingedPayload method.

0d64749

Minor adjustment so unit tests pass.

f62331d

vncoelho reviewed Jan 31, 2019

neo/Consensus/ConsensusContext.cs (outdated)

neo/Consensus/ConsensusContext.cs (outdated)

erikzhang and others added 2 commits January 31, 2019 21:32

Merge branch 'consensus/improved_dbft' into consensus/regenerateOnChangeView

ba05cef

If committed, resend the commit periodically in case of a networking issue.

b938f96

erikzhang reviewed Jan 31, 2019

neo/Consensus/ConsensusService.cs (outdated)

Whenever the commit is sent reset the timer for resending in case of a network issue..

50d1f9f

erikzhang reviewed Jan 31, 2019

neo/Consensus/RecoveryMessage.cs (outdated)

erikzhang and others added 3 commits February 16, 2019 14:35

Make sure CommitMessages is not null

f8f8504

Remove unused local.

cee538f

Merge branch 'consensus/regenerateOnChangeView' of https://github.com/neo-project/neo into consensus/regenerateOnChangeView

2b80c0a

jsolman and others added 3 commits February 15, 2019 22:43

Minor clean-up reducing nesting.

fa7989f

Remove unnecessary null checks

56edc43

Set ChangeViewPayloads in MakeChangeView()

a031c7a

erikzhang added 2 commits February 16, 2019 15:03

Remove unnecessary null checks

6d88860

Should not set ResponseSent flag if we are the primary

3c801dc

jsolman commented Feb 16, 2019

neo/Consensus/ConsensusContext.cs

neo-project deleted a comment from erikzhang Feb 16, 2019

Change the fields order of IConsensusContext

d25a682

jsolman commented Feb 16, 2019

neo/Consensus/ConsensusService.cs (outdated)

erikzhang and others added 2 commits February 16, 2019 15:25

Set PreparationPayloads in MakePrepareResponse()

90ff0c5

Relocate methods that get payloads form the recovery message to the recovery message.

9926b74

erikzhang reviewed Feb 16, 2019

neo/Consensus/ConsensusService.cs

jsolman and others added 2 commits February 15, 2019 23:51

Check ViewChanging below.

2b0eb19

Code cleanup

58150aa

erikzhang approved these changes Feb 16, 2019

jsolman merged commit be37423 into consensus/improved_dbft Feb 16, 2019

erikzhang added this to the NEO 3.0 milestone Feb 16, 2019

erikzhang deleted the consensus/regenerateOnChangeView branch February 16, 2019 08:14

vncoelho mentioned this pull request Feb 16, 2019

Optimized Delegated Byzantine Fault Tolerance (ODBFT) - Part I: commit phase + regeneration strategy + minnor message p2p route optimizing #426

Closed

vncoelho reviewed Feb 16, 2019

neo/Consensus/ConsensusService.cs

neo/Consensus/ConsensusService.cs

jsolman mentioned this pull request Mar 18, 2019

Minor additional flag when processing OnChangeViewReceived #641

Merged

Thacryba pushed a commit to simplitech/neo that referenced this pull request Feb 17, 2020

Enhance consensus (neo-project#579)

8657429

* update consensus fix minor issues and change the formatting
* minor issues
