(4/8) Update behavior of add/remove/refresh on duplicate sequence numbers #3558
Conversation
utACK
I think all changes are correct, but I am not 100% sure whether some subtle changes might have unwanted consequences. It will require very careful testing to ensure we don't break anything. We should merge this into master once we are free of release pressure, so there is enough time for devs running from source to discover potential issues.
Regarding the concurrency tests:
Bisq uses a single-threaded model as far as possible (exceptions: the P2P connection class, BitcoinJ internal classes, disk persistence classes, some util code, ...). P2PDataStorage has no concurrent access, as all network data is mapped onto the UserThread.
Thanks for the information about the threading model. I'll dig into it tomorrow to get a better understanding and refresh this stack to remove the tests and comments around impossible interleavings.
Fix a bug where remove() was called in the addMailboxData() failure path. 1. Senders can't remove mailbox entries. Only the receiver can remove them, so even if the previous add() failed and left partial state, the remove() could never succeed. 2. Even if the sender could remove, this path used remove() instead of removeMailboxData(), so it wouldn't have succeeded anyway. This patch cleans up the failure path and adds a precondition to the remove() function to ensure future callers don't use it for ProtectedMailboxStorageEntrys.
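As an illustration of the precondition described in that commit message, a minimal sketch might look like the following. Class and method names here are hypothetical stand-ins, not Bisq's actual API:

```java
// Hypothetical sketch of the remove() precondition described above; the
// class hierarchy and method names are illustrative, not Bisq's actual code.
class ProtectedStorageEntry { }

class ProtectedMailboxStorageEntry extends ProtectedStorageEntry { }

public class RemovePrecondition {
    // Senders can't remove mailbox entries, and mailbox removal must go
    // through removeMailboxData(), so reject mailbox entries up front.
    public static boolean remove(ProtectedStorageEntry entry) {
        if (entry instanceof ProtectedMailboxStorageEntry)
            throw new IllegalArgumentException(
                "Use removeMailboxData() for ProtectedMailboxStorageEntry");
        // ... sequence-number checks and the actual removal would follow ...
        return true;
    }
}
```

The precondition turns a silent no-op failure into a loud error, so any future caller that repeats the original bug fails fast in testing.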
Add comments and refactor the body in order to make it easier to understand.
Refactor addProtectedStorageEntry for more readability and add comments to help future readers.
Refactor for readability and add comments for future readers.
Refactor for readability and add comments to help future readers.
Refactor for readability and add comments for future readers.
Removed duplicate log messages that are already handled inside the various helper methods, and print more verbose state useful for debugging. Updated potentially misleading comments around hash collisions.
…icates Now returns false on duplicate sequence numbers. This better matches the expected behavior for an add() function when the element already exists. The only callers are either P2PService users that always increment the sequence number or the onMessage() handler, which doesn't verify the return value, so there will be no visible change other than increased readability and deduplication of the code paths.
Now returns false if the sequence number of the refresh matches the last operation seen for the specified hash. This is a more intuitive return value when no state change occurs. The only callers are either P2PService users that always increment the sequence number or the onMessage() handler, which doesn't verify the return value, so there will be no visible change other than increased readability and deduplication of the code paths.
…duplicate sequence #s Remove operations are now only processed if the sequence number is greater than the last operation seen for a specific payload. The only creator of new remove entries is the P2PService layer, which always increments the sequence number. So this is either left over from a time when removes needed to work with non-incrementing sequence numbers, or just a longstanding bug. With this patch, all operations now require increasing sequence numbers, so it should be easier to reason about the behavior in future debugging.
Commits start at: c1ad6b4

The request got stale and polluted with merge commits and has now been updated with master.

Chimp's utACK outlined the risk of possible subtle changes to client behavior, since we may now return a false value to client code where it previously returned true. After more review of the caller code, I don't believe there is risk to client callers. All client code paths generate new Entrys with sequence numbers greater than any previously seen sequence number (old + 1). This means any new behavior in the P2PDataStore that requires increasing sequence numbers won't affect those client callers, since it isn't possible for them to generate an invalid sequence number (under either the old or the new checks).

Instead, this new behavior changes what happens when messages received from peer nodes or seed nodes carry sequence numbers identical to those seen before. The old code made it possible to allow adds-after-removes, which would have re-added data and signaled listeners.

Since this is a fundamental change to the P2P message-handling logic, I encourage a merge when things are stable and there is no release pressure, as suggested. I've done my best to add thorough tests, validate it in my regtest cluster, and run a production node with my changes while auditing the logs to catch any glaring issues, but I've only been looking at this code for 3 weeks and don't have nearly the breadth of knowledge of the other developers who have touched this code recently. Thanks for considering this request, and I am happy to answer any other questions that come up.
utAck
utACK
These patches change some unexpected behavior when handling operations that have duplicate sequence numbers.
Now, if no internal state is changed the entry points will always return false.
It also fixes a subtle behavior of remove() where the code would allow duplicate add() operations if a
remove was processed with an equivalent sequence number.
Now, all operations require an increasing sequence number to update any internal state. I think this is a much easier way to think about the system, but I have no historical context on whether or when these operations were required to accept equivalent sequence numbers between add() and remove().
I think it also closes a hole where a malicious node could rebroadcast an add() as a remove() and have it succeed. There are still other holes, but this moves things in the right direction.
This is the highest risk change in this stack.
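The add-after-remove hole described in this review can be demonstrated with a toy model of the new rule, where add and remove share one sequence-number gate. Names are hypothetical; this is not Bisq's actual P2PDataStorage:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical names, not Bisq's code) of the rule that every
// add/remove must carry a sequence number strictly greater than the last
// one seen for that payload hash.
public class P2PDataStoreModel {
    private final Map<String, Integer> seqByHash = new HashMap<>();
    private final Map<String, Boolean> present = new HashMap<>();

    // Shared gate: reject any operation whose sequence number is not
    // strictly greater than the last one seen for this payload hash.
    private boolean advance(String hash, int seq) {
        Integer prev = seqByHash.get(hash);
        if (prev != null && seq <= prev)
            return false;
        seqByHash.put(hash, seq);
        return true;
    }

    public boolean add(String hash, int seq) {
        if (!advance(hash, seq)) return false;
        present.put(hash, true);
        return true;
    }

    public boolean remove(String hash, int seq) {
        if (!advance(hash, seq)) return false;
        present.put(hash, false);
        return true;
    }

    public boolean contains(String hash) {
        return present.getOrDefault(hash, false);
    }
}
```

Because add() and remove() consult the same gate, a rebroadcast add() carrying the same sequence number as an already-processed remove() is rejected, so the removed data cannot be re-added and listeners are not re-signaled.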