-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Commit phase - AKKA Model #422
Conversation
At this moment, the unique different line is this, the lock: https://github.com/neo-project/neo/pull/422/files#diff-0285c1c12d1d492897a99ffe07f9fed9R77 I am working on recover the consensus state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shargon brodas,
I saw that change view is now being blocked with if (context.State.HasFlag(ConsensusState.CommitSent)) return
at CheckExpectedView
method.
If the comments I mentioned are applied, then, we would send a partial signature and we would need Two Counters now (context.SignaturesPartial
and context.Signatures
).
After veryfing that context.SignaturesPartial
is fulfilled at line
neo/neo/Consensus/ConsensusService.cs
Line 87 in 064ebc9
if (!context.State.HasFlag(ConsensusState.CommitSent) && |
Then, the CN would enter in a loop waiting for the real signatures context.Signatures
.
For me, the point is: The same problem of the "fork" can happen here, some CN entering in the BlockChangeView (commit phase) and others not.
However, there should be a mechnism in which Good Nodes would be able to also enter commit phase, for this purpose, the nodes in the Commit Phase (Blocking the view) should Re-Send their Partial Signatures in order that good nodes (that possible entered in a loop due to connections looses) will occasionally join the commit phase waiting for the Real Signatures.
@@ -82,8 +84,30 @@ private void CheckExpectedView(byte view_number) | |||
|
|||
private void CheckSignatures() | |||
{ | |||
if (!context.State.HasFlag(ConsensusState.CommitSent) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shargon,
I believe that here we should check the partial signatures instead if the complete and, then, at lines 353 (
neo/neo/Consensus/ConsensusService.cs
Line 353 in 064ebc9
context.Signatures[context.MyIndex] = context.MakeHeader().Sign(context.KeyPair); |
neo/neo/Consensus/ConsensusService.cs
Line 52 in 064ebc9
context.Signatures[context.MyIndex] = context.MakeHeader().Sign(context.KeyPair); |
I think that the priority should be fix the network errors. Obiously we need to find the right way to fix the malicious CN who can produce a fork, but i think ... if we send the complete signatures on the commit, we are doing the same as without this patch, i need to think about it 🤔 |
I also agree that the priority is to fix the network errors. |
good job @shargon, it's working on private net.
Now, we have to decide if is better to lock change view on commit phase.
Also, as @vncoelho said, partial signature is important here. |
Hey, @belane, Jaime, I prefer 2. instead of 1.
aheuaheuahuea Let's evaluate and think about this with careful, maybe there also can be a problem...Crazy and Smart minds everywhere, we never know where they gonna find a possible entrance 📦 |
But another discussion is: merge the critical part and later think about this or try to add this additional important change now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shargon, fast as a bolt!
The first commit looks precise to me.
The second one (f0c2ada) I do not know, because I thought that the CheckExpectedView
was correctly implemented for blocking the ChangeView of those nodes with commit already sent. I think that you do not need to re-send here, just keep that same return
for blocking changeview.
Now, the challenges are:
- Create a mechanism for accepting the incoming re-send (at OnPrepareResponse). Perhaps, including an exception here:
neo/neo/Consensus/ConsensusService.cs
Line 248 in f0c2ada
if (message.ViewNumber != context.ViewNumber && message.Type != ConsensusMessageType.ChangeView) - Change the complete signature at line
neo/neo/Consensus/ConsensusService.cs
Line 52 in f0c2ada
context.Signatures[context.MyIndex] = context.MakeHeader().Sign(context.KeyPair);
context.Signatures[context.MyIndex] = context.MakeHeader().Sign(context.KeyPair);
context.SignaturesPartial[context.MyIndex] = context.SOMETHINGTOTHINGABOUT.Sign(context.KeyPair);
neo/Consensus/ConsensusService.cs
Outdated
// If signature was sent, we send again | ||
|
||
SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); | ||
CheckSignatures(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe remove this CheckSignatures
here because the node is just re-sending. It should keep Checking at OnPrepareResponseReceived
which also filters messages from nodes with wrong view and etc...
Still some repeated logs.. we must see that:
|
@@ -13,5 +13,6 @@ internal enum ConsensusState : byte | |||
SignatureSent = 0x10, | |||
BlockSent = 0x20, | |||
ViewChanging = 0x40, | |||
CommitSent = 0x80, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct 0x80.
|
||
private void OnCommitAgreement(ConsensusPayload payload, CommitAgreement message) | ||
{ | ||
Log($"{nameof(OnCommitAgreement)}: height={payload.BlockIndex} hash={message.BlockHash.ToString()} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put this message Log($"{nameof(OnCommitAgreement)}: ...
after the if
? So it won't be repeated.
Please, do that to Log($"{nameof(OnPrepareResponseReceived)}: ...
too, because it's also repeated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shargon This is why messages are repeated, easy to fix.
@igormcoelho try again with 14a31e6 please :) @vncoelho i removed the CheckSignatures like you said on 6b23061 |
@@ -233,15 +275,22 @@ private void OnPersistCompleted(Block block) | |||
|
|||
private void OnPrepareRequestReceived(ConsensusPayload payload, PrepareRequest message) | |||
{ | |||
if (context.State.HasFlag(ConsensusState.RequestReceived)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good move! Avoiding this repeated message.
if (context.State.HasFlag(ConsensusState.BlockSent)) return; | ||
if (context.Signatures[payload.ValidatorIndex] != null) return; | ||
|
||
Log($"{nameof(OnPrepareResponseReceived)}: height={payload.BlockIndex} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to filter before RequestView
@@ -360,6 +411,20 @@ public static Props Props(NeoSystem system, Wallet wallet) | |||
|
|||
private void RequestChangeView() | |||
{ | |||
if (context.State.HasFlag(ConsensusState.CommitSent)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this and filter before, @shargon.
else if ((context.State.HasFlag(ConsensusState.Primary) && context.State.HasFlag(ConsensusState.RequestSent)) || (context.State.HasFlag(ConsensusState.Backup) && !context.State.HasFlag(ConsensusState.CommitSent)) )
at line
neo/neo/Consensus/ConsensusService.cs
Line 384 in 6b23061
else if ((context.State.HasFlag(ConsensusState.Primary) && context.State.HasFlag(ConsensusState.RequestSent)) || context.State.HasFlag(ConsensusState.Backup)) |
neo/Consensus/ConsensusService.cs
Outdated
{ | ||
// If signature was sent, we send again | ||
|
||
SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you need to Relay your CommitAgreement again too. Not sure. Or is it implicit by PrepareResponse message?
Shargon, it's looking good for me! I'll try to simulate forking scenarios again, to make sure it's 100%... but congratulations in advance ;) |
neo/Consensus/ConsensusService.cs
Outdated
{ | ||
// If signature was sent, we send again | ||
|
||
SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shargon ,if a node (in commit phase) receives change view request, and re-send prepareResponse. From current code, the node (who have sent the changeView request ) will not accept prepareResponse because the viewNumber does not match. We may need to find a way to accept the re-send prepareResponse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is just for dropped packets (network errors), maybe there are a disconnection, and this packet never arrive to your CN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. Imagine one CN(A) sent ChangeView request to other CN if it did not collect enough PrepareResponse on time because the network error. Then A increases its view number. But when other CN(B) receives the ChangeView message, it resends prepareResponse because B is in commit phase now.
In this situation, even if A re-connect with B, A will not accept B's resent prepareResponse message because view numbers do not match anymore. But we need to let A accepts this message and move on to commit phase as B, right ?
neo/Consensus/ConsensusService.cs
Outdated
if (context.State.HasFlag(ConsensusState.BlockSent) || | ||
!context.TryToCommit(payload, message)) return; | ||
|
||
Log($"{nameof(OnCommitAgreement)}: height={payload.BlockIndex} hash={message.BlockHash.ToString()} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I close this for focus on #426 |
I would prefer a smaller pr instead of a big pr with many changes. |
Hey @erikzhang, I think that NGD is already testing the other PR and we also have been testing it for a couple of days.
The PR is a sum of several contributions that all of us have been discussing in the past months. |
Is the same implementation as #320
TODO:
Fixes #193
Fixes neo-project/neo-node#219