two p2p feature enhancement proposal #4053
Good suggestions. Not sure I understand the rationale for differentiating between persistent peers and "wildcard peers". Is it because you don't want to provide a static IP for a peer that should always be allowed an inbound slot? I think the default behavior should be to allow explicitly configured peers a connection slot (similar to what is suggested in #2041). Instead of having both "wildcard peers" and "persistent peers", perhaps something like this would work instead:
persistent_peers_dialing_max_period sounds sensible!
Agreed, I think you would get something like: This way you can also set up a connection direction without worrying about open peer slots. Also, when having persistent peers I can imagine that some of them need to be kept private. When 'fixing' the persistent_peer, make sure to look at #1705 so it won't expose private peers that are part of the persistent_peer.
I like both suggestions! Thanks for sharing your thoughts! So in summary,
That sounds like a neater solution. What do you think? @mdrying @Creamers158 //edit We found out that putting two different structures in one list (id@ip and id) creates implementation complexity in the codebase. So we came back to the original idea of a separate wildcard_peer_ids, which only accepts entries in id-only format. Another option raised is to put inbound wildcards in persistent_peers using the originally allowed id@0.0.0.0:0 format, but that seems likely to create new confusion and unnecessary errors. Whether to include every persistent_peer as a wildcard_peer is still an open question.
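To make the id-only idea concrete, here is a minimal Go sketch of how a separate id-only list could be parsed and consulted when an inbound connection arrives. It is not Tendermint code: the function names `parseIDOnlyList` and `allowInbound`, the comma-separated layout, and the case-insensitive ID comparison are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIDOnlyList parses a comma-separated list of bare node IDs.
// Hypothetical helper: it rejects any entry containing "@", so the
// id@ip:port form used by persistent_peers cannot be mixed in.
func parseIDOnlyList(list string) (map[string]struct{}, error) {
	ids := make(map[string]struct{})
	for _, raw := range strings.Split(list, ",") {
		entry := strings.TrimSpace(raw)
		if entry == "" {
			continue // tolerate an empty list or a trailing comma
		}
		if strings.Contains(entry, "@") {
			return nil, fmt.Errorf("expected a bare node ID, got %q", entry)
		}
		ids[strings.ToLower(entry)] = struct{}{}
	}
	return ids, nil
}

// allowInbound reports whether an inbound peer may connect.
// Assumption for this sketch: listed IDs bypass the inbound slot limit,
// while every other peer is admitted only when a slot is free.
func allowInbound(peerID string, inbound, maxInbound int, wildcard map[string]struct{}) bool {
	if _, ok := wildcard[strings.ToLower(peerID)]; ok {
		return true
	}
	return inbound < maxInbound
}

func main() {
	ids, err := parseIDOnlyList("aaa111, bbb222")
	if err != nil {
		panic(err)
	}
	fmt.Println(allowInbound("aaa111", 40, 40, ids)) // listed peer, slots full
	fmt.Println(allowInbound("ccc333", 40, 40, ids)) // unlisted peer, slots full

	_, err = parseIDOnlyList("aaa111@10.0.0.1:26656")
	fmt.Println(err != nil) // id@ip entries are rejected
}
```

Keeping the list id-only means the parser never has to guess whether an entry carries an address, which is the complexity the edit above describes.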
Below are our up-to-date detailed specs.
wildcard_peer_ids
1) problem
2) objective
3) implementation detail
maximum_dial_period
1) problem
2) objective
3) implementation detail
Thanks @dlguddus for writing up a concise spec outline. Would you be able to write up the context/background and the proposal (it can be concise) into a quick ADR? Ideally this would happen before a detailed spec (a spec may not even be necessary here) and implementation. That being said, I'm not too privy to the mechanics of Tendermint's p2p layer and config, so correct me if I'm wrong here. It seems the goal of wildcard peer IDs should already be accomplished by the existing persistent peer IDs config? That is, even if all connection slots are filled, persistent peers always have the privilege to connect. Is this not the case? If not, why? However, I do think the
Difference between persistent peers and wildcard:
Because of the struct difference (id@ip vs id), we think it is too complicated to put the wildcard function into persistent peers. ADR: #4072
Is #1705 not resolved yet? If it is still in bug status, we might want to include it in this proposal.
I admit I don't operate a validator, and I'm still coming up to speed with lots of things here. But can you share more context on
What exactly are the problems that you experienced? It sounds like this is a reasonable proposal but I'd like to better understand the issue. Thank you!
In PoS, nodes have to share their IP and ID to connect to other nodes, and this carries significant risks, from DDoS attacks to outright node compromise. So we build a sentry architecture: public nodes (sentries) sit in front, and internal nodes buffer the connection between the public internet and the precious validator. These internal connections should always stay connected. But because of exponential backoff and max_num_peer, a dropped connection often never comes back. This makes internal (or trusted) connections unstable quite frequently. But I think this is just the tip of the iceberg. As a validator operator, we experience many other issues too. This ADR is a pilot project for us to gain experience working with the Tendermint team and to find a better approach and better communication on our side. If it proves efficient and worthwhile, we would like to dig deeper and suggest more improvements to Tendermint's p2p features.
OK, thank you for the explanation. I hear you're coming to our dev session on Monday, and it sounds like there will be further discussion then. I'm looking forward to it!
Refs #4053

## Commits:

* Create adr-050-improved-trusted-peering.md
* Modify `maximum_dial_period`
  Modify `maximum_dial_period` to `persistent_peers_maximum_dial_period`
* Update adr-050-improved-trusted-peering.md
* Update docs/architecture/adr-050-improved-trusted-peering.md
  Co-Authored-By: Tess Rinearson <tess.rinearson@gmail.com>
* Update docs/architecture/adr-050-improved-trusted-peering.md
  Co-Authored-By: Tess Rinearson <tess.rinearson@gmail.com>
* Update docs/architecture/adr-050-improved-trusted-peering.md
  Co-Authored-By: Tess Rinearson <tess.rinearson@gmail.com>
* Update docs/architecture/adr-050-improved-trusted-peering.md
  Co-Authored-By: Tess Rinearson <tess.rinearson@gmail.com>
* wildcard -> unconditional
  wildcard -> unconditional
* Remove blank lines
* fix spelling
* add quotes
I would like to suggest two p2p feature enhancements.
Please share your opinions (agreement, disagreement, additional features related to these, possible vulnerabilities, etc.). Also, if anyone wants to add other improvements to p2p peer management, we can add those to our job list (such as blacklist_peer_ids).
These two issues have been among the biggest pain-in-the-ass problems in our real-world validator operations. Fixing them will greatly improve the stability of connections among nodes.
Based on feedback from the community, we have started hacking on these two issues. We will ask for a grant of 2k ATOM from the community fund via a governance proposal "after" we open the PR. We expect the PR to be ready in October.