Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix consensus #425

Merged
merged 3 commits into from
Oct 27, 2018
Merged

fix consensus #425

merged 3 commits into from
Oct 27, 2018

Conversation

erikzhang
Copy link
Member

No description provided.

@erikzhang erikzhang requested a review from shargon October 24, 2018 16:58
@vncoelho
Copy link
Member

vncoelho commented Oct 24, 2018

This fix is more related to avoiding communication after block is sent and verification of signatures, not so much related to the commit phase we have been working, right?
Good move.

igormcoelho
igormcoelho previously approved these changes Oct 25, 2018
Copy link
Contributor

@igormcoelho igormcoelho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes! Seem to allow PrepareResponse to arrive prior to PrepareRequest, which happens a lot in practice. Usually timer used to finish together with block relay too, good fix... now with Akka I haven't seen anymore.

@shargon
Copy link
Member

shargon commented Oct 25, 2018

@erikzhang could you explain what happend here? why the necesity of review the signatures?

byte[] hashData = context.MakeHeader().GetHashData();
if (!Crypto.Default.VerifySignature(hashData, message.Signature, context.Validators[payload.ValidatorIndex].EncodePoint(false))) return;
for (int i = 0; i < context.Signatures.Length; i++)
if (context.Signatures[i] != null)
if (!Crypto.Default.VerifySignature(hashData, context.Signatures[i], context.Validators[i].EncodePoint(false)))
context.Signatures[i] = null;

@vncoelho
Copy link
Member

vncoelho commented Oct 25, 2018

What if a normal RPC inject this, Shargon?
I think that this verification ensures that.

@igormcoelho
Copy link
Contributor

I think its not reviewing signatures shargon, its allowing to.keep sigs from PrepReponse if PrepRequest arrives later.. the previous code was dropping useful signatures I guess.

@erikzhang
Copy link
Member Author

I think its not reviewing signatures shargon, its allowing to.keep sigs from PrepReponse if PrepRequest arrives later.. the previous code was dropping useful signatures I guess.

Yes

@shargon
Copy link
Member

shargon commented Oct 25, 2018

It seems good for me, we will test it! :)

@erikzhang
Copy link
Member Author

I updated it again and solved the consensus problem that is now facing. Please review it again. @shargon @vncoelho @igormcoelho

@vncoelho
Copy link
Member

vncoelho commented Oct 26, 2018

Double checked, @erikzhang. Great job and precise.

@vncoelho
Copy link
Member

vncoelho commented Oct 26, 2018

Erik, what if you verify the transactions OnPersist and, then, delete/remove the ones that fails the test along with these ones here

mem_pool.TryRemove(tx.Hash, out _);
?

@igormcoelho
Copy link
Contributor

igormcoelho commented Oct 26, 2018

@erikzhang This change just feels redundant to me, because these tx are supposed to exist on mempool already... so perhaps it's something I didn't get. The most important is to get Neo safer and safer now, even if it costs performance temporarialy, so I agree with the changes.

@shargon
Copy link
Member

shargon commented Oct 26, 2018

@erikzhang if a CN doesn't have a TX, where he asking for it?

@erikzhang
Copy link
Member Author

Erik, what if you verify the transactions OnPersist and, then, delete/remove the ones that fails the test along with these ones here

If there are a large number of transactions in the memory pool, it takes a lot of time to re-verify them all.

This change just feels redundant to me, because these tx are supposed to exist on mempool already...

After the block is persisted, all transactions in the memory pool must be re-verified.

if a CN doesn't have a TX, where he asking for it?

SignAndRelay(context.MakePrepareRequest());
if (context.TransactionHashes.Length > 1)
{
foreach (InvPayload payload in InvPayload.CreateGroup(InventoryType.TX, context.TransactionHashes.Skip(1).ToArray()))
system.LocalNode.Tell(Message.Create("inv", payload));
}

The primary will re-relay the transactions after sending the PrepareRequest.

@vncoelho
Copy link
Member

vncoelho commented Oct 26, 2018

Thanks, Erik.

If there are a large number of transactions in the memory pool, it takes a lot of time to re-verify them all.

Does this re-verifiying occurs naturally at the OnNewTransaction events (however, depends on P2P communication)?

One last question and summary, Erik. 🗡️

  1. OnPersistCompleted Self.Tell(tx, ActorRefs.NoSender) send a message to All Connections with all current reimaining TXs in its local mempool, right?
  2. Then, mem_pool is clear;
  3. Block is distributed to everyone again;
  4. Speaker FillContext only using its local memory pool with call .GetMemoryPool(). Since mem_pool was clear, Speaker memory pool just contains transactions that arrived though OnNewTransaction calls, correct?
  5. Backups checks the Primary Context and get missing transactions from mem_pool_unverified for speed up instead of waiting for a transaction to arrive naturally as OnNewTransaction.

Thanks for the patient.

Following this reasoning, one could see this mem_pool_unverified as a speed up instead of a fix?

@belane
Copy link
Member

belane commented Oct 26, 2018

Tested on private net and working fine.

==> /opt/node1/neo-cli/Logs/2018-10-26.log <==
[16:06:12.316] initialize: height=90 view=0 index=2 role=Primary
[16:06:27.323] timeout: height=90 view=0 state=Primary
[16:06:27.324] send prepare request: height=90 view=0
[16:06:27.345] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.380] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.385] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.390] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.394] relay block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a
[16:06:27.397] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.400] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.401] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.405] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.410] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.413] persist block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a

==> /opt/node2/neo-cli/Logs/2018-10-26.log <==
[16:06:12.322] initialize: height=90 view=0 index=0 role=Backup
[16:06:27.328] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.340] send prepare response
[16:06:27.341] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.345] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.346] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.352] relay block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a
[16:06:27.358] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.360] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.361] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.365] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.368] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.376] persist block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a

==> /opt/node3/neo-cli/Logs/2018-10-26.log <==
[16:06:12.337] initialize: height=90 view=0 index=3 role=Backup
[16:06:27.328] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.334] send prepare response
[16:06:27.335] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.335] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.342] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.364] relay block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a
[16:06:27.373] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.374] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.374] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.375] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.386] OnPrepareResponseReceived: height=90 view=0 index=1
[16:06:27.398] persist block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a

==> /opt/node4/neo-cli/Logs/2018-10-26.log <==
[16:06:12.325] initialize: height=90 view=0 index=1 role=Backup
[16:06:27.335] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.343] send prepare response
[16:06:27.347] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.347] OnPrepareRequestReceived: height=90 view=0 index=2 tx=1
[16:06:27.348] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.355] relay block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a
[16:06:27.369] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.371] OnPrepareResponseReceived: height=90 view=0 index=3
[16:06:27.372] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.380] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.383] OnPrepareResponseReceived: height=90 view=0 index=0
[16:06:27.388] persist block: 0x5440aad0193a621d92555f35b914b5e1f71743d1c94b8b94df64b69161308d4a

@belane
Copy link
Member

belane commented Oct 26, 2018

I would prefer to re-verify the pending TXs in Blockchain.OnPersistCompleted insted to move them to mem_pool_unverified because they are now available only for backup nodes if they receive a PrepareRequest with one of these TX but are not available for the Primary in FillContext.

@erikzhang
Copy link
Member Author

If there are a lot of transactions in the memory pool, it will take a long time to re-verify them. And it will block the consensus. That's why I move them into mem_pool_unverified.

@shargon
Copy link
Member

shargon commented Oct 26, 2018

This lock the state when a Block was sent, right? do you think that is needed the commit phase after this PR?

@vncoelho
Copy link
Member

vncoelho commented Oct 26, 2018

@erikzhang, I agree with you that we should not re-verify at OnPersist.

I think that this mem_pool_unverified is a speed up and not a fix, right?
I am trying to understand how the mempool is filled again.
Nowadays, all transaction that were cleared arrive now with Self.Tell(tx, ActorRefs.NoSender) as a OnNewTransaction event one by one?

With this PR the Backups will be able to speed up the process of getting a hash that they do not know.

@erikzhang
Copy link
Member Author

I think that this mem_pool_unverified is a speed up and not a fix, right?

Right.

@vncoelho
Copy link
Member

Lets go for it and then we can merge with the commit phase for finally tackling that.

@erikzhang erikzhang merged commit 2227e16 into master Oct 27, 2018
@erikzhang erikzhang deleted the fixes/consensus branch October 27, 2018 09:56
Copy link
Member

@vncoelho vncoelho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikzhang, last time question... 🔢

byte[] hashData = context.MakeHeader()?.GetHashData();
if (hashData == null)
{
context.Signatures[payload.ValidatorIndex] = message.Signature;
Copy link
Member

@vncoelho vncoelho Oct 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikzhang, I am double checking here and just to be sure I am thinking why message.Signature is assigned here to the Signatures array.

Should not it be context.Signatures[payload.ValidatorIndex] = null?
If the hashData is null it means that the payload is kind of empty, right?

@erikzhang
Copy link
Member Author

I think @igormcoelho has answered your question:

I think its not reviewing signatures shargon, its allowing to.keep sigs from PrepReponse if PrepRequest arrives later.. the previous code was dropping useful signatures I guess.

@vncoelho
Copy link
Member

vncoelho commented Oct 27, 2018

That part of reviewing signatures I understood (I think, eahueauea).

I mean this line

context.Signatures[payload.ValidatorIndex] = message.Signature;

Should not it be context.Signatures[payload.ValidatorIndex] = null;?

Because Node B could receive wrong empty hashdata of (C,D) and then proceed to CheckSignatures(); when a correct signature of node E arrives.
And CheckSignature could pass, relaying a block, because it does not verify again.

It it is null the verification will not count them context.Signatures.Count(p => p != null):

if (context.Signatures.Count(p => p != null) >= context.M && context.TransactionHashes.All(p => context.Transactions.ContainsKey(p)))

Thacryba pushed a commit to simplitech/neo that referenced this pull request Feb 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants