
(4/8) Update behavior of add/remove/refresh on duplicate sequence numbers #3558

Merged
merged 10 commits into from
Nov 18, 2019

Conversation

julianknutsen
Contributor

These patches change some unexpected behavior when handling operations that have duplicate sequence numbers.

Now, if no internal state is changed, the entry points will always return false.

It also fixes a subtle behavior of remove() where the code would allow duplicate add() operations if a
remove was processed with an equivalent sequence number.

Now, all operations require an increasing sequence number to update any internal state. I think this is a much easier way to think about the system, but I have no historical context for whether these operations ever required equivalent sequence numbers between add() and remove().

I think it also closes a hole where a malicious node could rebroadcast an add() as a remove() and have it succeed. There are still other holes, but this moves things in the right direction.

This is the highest risk change in this stack.
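The rule described above can be sketched as follows. This is a hypothetical, heavily simplified model of the idea (strictly increasing sequence numbers per payload hash, false when nothing changes), not Bisq's actual P2PDataStorage code; the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model: every mutating operation must carry a sequence number
// strictly greater than the last one seen for that payload's hash.
class SequenceNumberStore {
    private final Map<String, Integer> lastSeenSeqNr = new HashMap<>();

    // Returns true only if internal state actually changed.
    boolean apply(String payloadHash, int sequenceNumber) {
        Integer prev = lastSeenSeqNr.get(payloadHash);
        if (prev != null && sequenceNumber <= prev)
            return false; // duplicate or stale: no state change
        lastSeenSeqNr.put(payloadHash, sequenceNumber);
        return true;
    }
}
```

Under this model a replayed message (same sequence number) or a stale one (lower sequence number) is a no-op that reports false, which is the behavior change this PR introduces at the entry points.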

chimp1984
chimp1984 previously approved these changes Nov 5, 2019
Contributor

@chimp1984 chimp1984 left a comment


utACK

I think all changes are correct, but I am not 100% sure whether some subtle changes might have unwanted consequences. It will require very careful testing to ensure that we don't break anything. We should merge this into master once we are free of release pressure, so that there is enough time for devs running from source to discover potential issues.

Contributor

@chimp1984 chimp1984 left a comment


Regarding the concurrency tests:
Bisq uses a single-threaded model as far as possible (exceptions: the P2P connection class, BitcoinJ internal classes, disk persistence classes, some util stuff, ...). The P2PDataStorage has no concurrent access, as all network data is mapped to the UserThread.

@julianknutsen
Contributor Author

Thanks for the information about the threading model. I'll dig into it tomorrow to get a better understanding and refresh this stack to remove the tests and comments around impossible interleavings.

@julianknutsen julianknutsen changed the title (4/4) Update behavior of add/remove/refresh on duplicate sequence numbers (4/8) Update behavior of add/remove/refresh on duplicate sequence numbers Nov 9, 2019
Fix a bug where remove() was called in the addMailboxData()
failure path.

1. Senders can't remove mailbox entries. Only
   the receiver can remove them, so even if the previous add() failed and
   left partial state, the remove() could never succeed.

2. Even if the sender could remove, this path used remove() instead
   of removeMailboxData(), so it wouldn't have succeeded anyway.

This patch cleans up the failure path and adds a precondition
to the remove() function to ensure future callers don't use it for
ProtectedMailboxStorageEntrys.
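The precondition described in this commit can be sketched roughly as below. The entry class names mirror Bisq's, but the stub types and the remove() body are illustrative only, not the real implementation.

```java
// Stub types standing in for Bisq's real entry classes.
class ProtectedStorageEntry { }

class ProtectedMailboxStorageEntry extends ProtectedStorageEntry { }

class DataStore {
    // remove() now rejects mailbox entries outright; callers must use
    // the dedicated removeMailboxData() path for those instead.
    boolean remove(ProtectedStorageEntry entry) {
        if (entry instanceof ProtectedMailboxStorageEntry)
            throw new IllegalArgumentException(
                    "use removeMailboxData() for mailbox entries");
        // ... sequence-number checks and actual removal would go here ...
        return true;
    }
}
```

Failing fast like this turns a silent no-op in the old failure path into an explicit error for any future caller that passes the wrong entry type.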
Add comments and refactor the body in order to make it easier to
understand.
Refactor addProtectedStorageEntry for more readability and add comments
to help future readers.
Refactor for readability and add comments for future readers.
Refactor for readability and add comments to help future readers.
Refactor for readability and add comments for future readers.
Removed duplicate log messages that are already handled inside the various helper
methods, and print more verbose state useful for debugging.

Updated potentially misleading comments around hashing collisions
…icates

Now returns false on duplicate sequence numbers. This better matches the
expected behavior of an add() function when the element already exists.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
Now returns false if the sequence number of the refresh matches
the last operation seen for the specified hash. This is a more expected
return value when no state change occurs.

The only callers are either P2PService users that always increment the
sequence number or the onMessage() handler which doesn't verify the return
so there will be no visible change other than the increased readability
of the code and deduplication of the code paths.
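The add() and refreshTTL() return-value changes described in the two commit messages above can be modeled as follows. This is a hypothetical simplification for illustration, not the real P2PDataStorage code; in the real store a successful refresh would also reset the payload's TTL timer.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the new add()/refreshTTL() return values.
class RefreshModel {
    private final Map<String, Integer> lastSeenSeqNr = new HashMap<>();

    boolean add(String hash, int seqNr) {
        Integer prev = lastSeenSeqNr.get(hash);
        if (prev != null && seqNr <= prev)
            return false; // duplicate add: no state change
        lastSeenSeqNr.put(hash, seqNr);
        return true;
    }

    boolean refreshTTL(String hash, int seqNr) {
        Integer prev = lastSeenSeqNr.get(hash);
        if (prev == null)
            return false; // nothing known to refresh
        if (seqNr <= prev)
            return false; // matches the last operation seen: no state change
        lastSeenSeqNr.put(hash, seqNr); // real code would also reset the TTL here
        return true;
    }
}
```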
…duplicate sequence #s

Remove operations are now only processed if the sequence number
is greater than the last operation seen for a specific payload.

The only creator of new remove entries is the P2PService layer, which always increments
the sequence number. So, this is either left over from a time when
removes needed to work with non-incrementing sequence numbers, or just
a longstanding bug.

With the completion of this patch, all operations now require increasing
sequence numbers so it should be easier to reason about the behavior in
future debugging.
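The strictly-greater rule for remove(), and the re-add hole it closes, can be sketched like this. Again a hypothetical simplified model, not the real code: under the old behavior a remove with an equivalent sequence number could succeed, after which a rebroadcast add() at that same number could re-add the data; with strictly increasing sequence numbers both are rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the new remove() rule: removes (like adds) only
// succeed with a strictly greater sequence number than the last seen.
class RemoveModel {
    private final Map<String, Integer> lastSeenSeqNr = new HashMap<>();
    private final Map<String, Boolean> present = new HashMap<>();

    boolean add(String hash, int seqNr) {
        Integer prev = lastSeenSeqNr.get(hash);
        if (prev != null && seqNr <= prev)
            return false;
        lastSeenSeqNr.put(hash, seqNr);
        present.put(hash, true);
        return true;
    }

    boolean remove(String hash, int seqNr) {
        Integer prev = lastSeenSeqNr.get(hash);
        if (prev == null || seqNr <= prev)
            return false; // requires a strictly increasing sequence number
        lastSeenSeqNr.put(hash, seqNr);
        present.put(hash, false);
        return true;
    }

    boolean isPresent(String hash) {
        return present.getOrDefault(hash, false);
    }
}
```

Note how an add() replayed at the remove's sequence number fails, so removed data stays removed.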
@julianknutsen
Contributor Author

julianknutsen commented Nov 11, 2019

Commits start at: c1ad6b4

The request got stale and polluted with merge commits and has now been updated against master. Chimp's utACK outlined the risk of possible subtle changes to client behavior, since we may now return false to client code where it previously returned true.

After more review of the client caller code, I don't believe there is risk to client callers. All client code paths generate new entries that have sequence numbers greater than any previously seen sequence number (old + 1). This means that any new behavior in the P2PDataStorage that requires increasing sequence numbers won't have any effect on those client callers, since it isn't possible for them to generate an invalid sequence number (under the old or new checks).

Instead, this new behavior changes what happens when messages are received from either peer nodes or seed nodes that have sequence numbers identical to those that have been seen before. It was possible to allow adds-after-removes in the old code which would have re-added data and signaled listeners.

Since this is a fundamental change to the P2P message handling logic, I encourage a merge when things are stable and there is no release pressure, as suggested. I've done my best to add thorough tests, validate it in my regtest cluster, and run a production node with my changes and audit the logs to catch any glaring issues, but I've only been looking at this code for 3 weeks and don't have nearly the breadth of knowledge of the other developers who have touched this code recently.

Thanks for considering this request and I am happy to answer any other questions that come up.

Contributor

@freimair freimair left a comment


utAck

Contributor

@ripcurlx ripcurlx left a comment


utACK

@ripcurlx ripcurlx merged commit fe1dd20 into bisq-network:master Nov 18, 2019
@julianknutsen julianknutsen deleted the update-duplicate-behavior branch November 18, 2019 18:11