-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configurable Peer Filtering #471
Conversation
Bumps [github.com/ipfs/go-datastore](https://github.com/ipfs/go-datastore) from 0.4.2 to 0.4.4. - [Release notes](https://github.com/ipfs/go-datastore/releases) - [Commits](ipfs/go-datastore@v0.4.2...v0.4.4) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a start and I'd really appreciate feedback. The two main things are:
- Are the filters sufficiently aggressive, overly aggressive, or not aggressive enough?
- Does the API seem reasonable?
opts/filters.go
Outdated
conns := h.Network().ConnsToPeer(ai.ID) | ||
if len(conns) > 0 { | ||
for _, c := range conns { | ||
if manet.IsPrivateAddr(c.RemoteMultiaddr()) { | ||
return true | ||
} | ||
} | ||
return false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be overkill in the sense that if we're already connected to the peer and we discovered them via a private DHT query then we may as well query them if we're already connected.
The approach here was to be highly defensive of our queries being polluted with non-private peers in order to keep the private DHT small and therefore make the cost of running 2 DHTs small.
Maybe we should remove this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove this check. We may learn about a few undailable peers, but we won't waste too much time dialing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I would short-circuit and, if we have a private connection, we should assume that the peer is local. I just wouldn't do the inverse "we're connected over a public IP" check.
opts/filters.go
Outdated
for _, a := range ai.Addrs { | ||
if manet.IsPublicAddr(a) { | ||
if !isRelayAddr(a) { | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is perhaps overly aggressive because it's possible that a peer has both a private address and a public one. Perhaps for queries it's not worth doing this check and only being critical of routing table additions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the idea here? Why would we want to filter out peers that have public addresses? IIRC, peers on the public network should also be in the private DHT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concern here isn't so much about undialable nodes, but about accidentally bridging the networks (e.g. starting a local DHT request and ending up with a global DHT request). However, I suspect that it's not really a big deal since if you get the data then you have it and this bridging shouldn't happen with behaving nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't filter out peers with public addresses. All peers on the local network, dialable or not, should be part of the local DHT.
Notes:
- We should have some way to pass an address filter function down into the dialer in the future. That's probably the right way to handle this.
- We should apply all the checks from https://github.com/libp2p/go-libp2p-kad-dht/pull/471/files#r387453724. E.g., we should filter out peers when (a) we know our public IPv4 address and (b) we see that it's not the same as theirs.
opts/options.go
Outdated
// QueryFilter sets a function that approves which peers may be dialed in a query | ||
func QueryFilter(filter func(h host.Host, ai peer.AddrInfo) bool) Option { | ||
return func(o *Options) error { | ||
o.QueryPeerFilter = filter | ||
return nil | ||
} | ||
} | ||
|
||
// RoutingTableFilter sets a function that approves which peers may be added to the routing table. The host should | ||
// already have at least one connection to the peer under consideration. | ||
func RoutingTableFilter(filter func(conns []core.Conn) bool) Option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would happily take better names for these. Is it obvious from the name the when true
is returned that we keep that peer?
opts/options.go
Outdated
// QueryFilter sets a function that approves which peers may be dialed in a query | ||
func QueryFilter(filter func(h host.Host, ai peer.AddrInfo) bool) Option { | ||
return func(o *Options) error { | ||
o.QueryPeerFilter = filter | ||
return nil | ||
} | ||
} | ||
|
||
// RoutingTableFilter sets a function that approves which peers may be added to the routing table. The host should | ||
// already have at least one connection to the peer under consideration. | ||
func RoutingTableFilter(filter func(conns []core.Conn) bool) Option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of me wants to include a host.Host
parameter in this function just in case we need it in the future. However, given that most users will probably just use our pre-built function breaking the API in the future may not be that big a deal.
query.go
Outdated
for _, np := range newPeers { | ||
closer := q.localPeers.Add(np.ID) | ||
foundCloserPeer = foundCloserPeer || closer | ||
if q.dht.queryPeerFilter(q.dht.host, *next) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How necessary is the query filter really, it seems like it's kind of "plan B" since hopefully the routing tables should be clean and only contain relevant peers. Is it ok for functions to overfilter peers here?
Do we also want to filter which addresses we add into the peerstore and/or which ones we eventually dial? If so is it important to do this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How necessary is the query filter really, it seems like it's kind of "plan B" since hopefully the routing tables should be clean and only contain relevant peers. Is it ok for functions to overfilter peers here?
For the public DHT, I'd introduce two checks:
- I can dial the peer.
- They aren't advertising relay addresses. That's a bit over-zealous, but we don't want to accidentally dial it.
For the private DHT, I'm not sure if we want to filter at all.
Do we also want to filter which addresses we add into the peerstore and/or which ones we eventually dial? If so is it important to do this now?
I'd actually keep all addresses. If the peer is on the local network, we might as well use those addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can dial the peer
What does this mean? If we don't check at all we will end up eventually dialing the peer (if they are useful). To me this sounds like the only query filter we want is for the public DHT to check for relay addresses, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can dial the peer
Sorry, I meant to say "the peer has a public address". Really, just checking for "no relay address" should be sufficient.
opts/filters.go
Outdated
addr := c.RemoteMultiaddr() | ||
if !isRelayAddr(addr) && manet.IsPublicAddr(addr) { | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a bit more aggressive than necessary since AFAIK if two peers are publicly dialable via port forwarding and are also on the same local network they likely will be connected to each other over private multiaddrs instead of public ones. This means we are potentially missing peers from our routing tables that could be there.
In the future we could add more expensive tests like pinging any non-relay public addresses we have for the peer in the peerstore (requires that the function have access to the peerstore), to see if the peer we're locally connected to is also publicly dialable.
If desired there could also be a simple check to see if (as far as we know) the peer and us share the same public network address (e.g. are behind the same NAT).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just check "do we have a public addr for this peer in the peerstore"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but there's no guarantee that it means anything right? This is the backup for what happens if an undialable peer doesn't remove themselves from being a DHT server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would I be correct in saying that the problem we want to solve is :
"Even for a public DHT, we should add a peer to the RT if that peer is on the same network as ours, even though we might not see a public address for it" ?
Would such a peer get added to our private DHT then ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aarshkshah1992 The problem we're trying to solve in this function (PublicRoutingTableFilter) is:
How can we help ourselves ensure (at minimal cost) that even if peers are not properly removing themselves from the public DHT (i.e. switching to client mode if they are undialable) that we do not add non-publicly dialable peers to our routing table?
f4e42b8
to
c276983
Compare
c276983
to
537e8f9
Compare
opts/filters.go
Outdated
return true | ||
} | ||
} | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check a few more things:
- If the remote address is in the carrier grade NAT range, abort.
- IPv4: If the remote address is our public IPv4 address, accept them (NAT hairpin).
- IPv6: If the peer is advertising the a public address that matches ours, accept them (could be abused?).
- IPv6: If the peer is advertising an address that looks like an EUI-64 Bit address, check to see if the first 64 bits are the same as ours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: on linux, we can check the route if the remote address is IPv6:
import "github.com/vishvananda/netlink"
...
routes, err := netlink.RouteGet(myIP)
for _, route := range routes {
if route.Gw == nil {
// We have a route to this host that doesn't go through a gateway. They must be on the same lan.
}
}
But we can file this away in the "to be improved" bucket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can file this away in the "to be improved" bucket.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@willscott is looking into this.
opts/filters.go
Outdated
addr := c.RemoteMultiaddr() | ||
if !isRelayAddr(addr) && manet.IsPublicAddr(addr) { | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just check "do we have a public addr for this peer in the peerstore"?
opts/filters.go
Outdated
conns := h.Network().ConnsToPeer(ai.ID) | ||
if len(conns) > 0 { | ||
for _, c := range conns { | ||
if manet.IsPrivateAddr(c.RemoteMultiaddr()) { | ||
return true | ||
} | ||
} | ||
return false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove this check. We may learn about a few undailable peers, but we won't waste too much time dialing them.
opts/filters.go
Outdated
for _, a := range ai.Addrs { | ||
if manet.IsPublicAddr(a) { | ||
if !isRelayAddr(a) { | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the idea here? Why would we want to filter out peers that have public addresses? IIRC, peers on the public network should also be in the private DHT.
query.go
Outdated
for _, np := range newPeers { | ||
closer := q.localPeers.Add(np.ID) | ||
foundCloserPeer = foundCloserPeer || closer | ||
if q.dht.queryPeerFilter(q.dht.host, *next) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How necessary is the query filter really, it seems like it's kind of "plan B" since hopefully the routing tables should be clean and only contain relevant peers. Is it ok for functions to overfilter peers here?
For the public DHT, I'd introduce two checks:
- I can dial the peer.
- They aren't advertising relay addresses. That's a bit over-zealous, but we don't want to accidentally dial it.
For the private DHT, I'm not sure if we want to filter at all.
Do we also want to filter which addresses we add into the peerstore and/or which ones we eventually dial? If so is it important to do this now?
I'd actually keep all addresses. If the peer is on the local network, we might as well use those addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it sounds like the requested changes are:
- Remove almost all of the query filtering logic from PublicQueryFilter and actually all of the logic from PrivateQueryFilter
- Cover more edge cases in IPv4 and IPv6 PrivateRoutingTableFilter
- Docs + Cleanup
I left a few clarifying questions
opts/filters.go
Outdated
addr := c.RemoteMultiaddr() | ||
if !isRelayAddr(addr) && manet.IsPublicAddr(addr) { | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but there's no guarantee that it means anything right? This is the backup for what happens if an undialable peer doesn't remove themselves from being a DHT server.
opts/filters.go
Outdated
conns := h.Network().ConnsToPeer(ai.ID) | ||
if len(conns) > 0 { | ||
for _, c := range conns { | ||
if manet.IsPrivateAddr(c.RemoteMultiaddr()) { | ||
return true | ||
} | ||
} | ||
return false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opts/filters.go
Outdated
for _, a := range ai.Addrs { | ||
if manet.IsPublicAddr(a) { | ||
if !isRelayAddr(a) { | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concern here isn't so much about undialable nodes, but about accidentally bridging the networks (e.g. starting a local DHT request and ending up with a global DHT request). However, I suspect that it's not really a big deal since if you get the data then you have it and this bridging shouldn't happen with behaving nodes.
opts/filters.go
Outdated
return true | ||
} | ||
} | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can file this away in the "to be improved" bucket.
SGTM
query.go
Outdated
for _, np := range newPeers { | ||
closer := q.localPeers.Add(np.ID) | ||
foundCloserPeer = foundCloserPeer || closer | ||
if q.dht.queryPeerFilter(q.dht.host, *next) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can dial the peer
What does this mean? If we don't check at all we will end up eventually dialing the peer (if they are useful). To me this sounds like the only query filter we want is for the public DHT to check for relay addresses, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments to improve my own understanding of what we are trying to do.
opts/filters.go
Outdated
) | ||
|
||
// PublicQueryFilter returns true if the peer is suspected of being publicly accessible | ||
func PublicQueryFilter(h host.Host, ai peer.AddrInfo) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not read the addresses from the peer store ?
22b3963
to
5d313b1
Compare
537e8f9
to
eeaa1d7
Compare
5cc259c
to
afc98b9
Compare
afc98b9
to
67b9cef
Compare
…ipfs/go-datastore-0.4.4 build(deps): bump github.com/ipfs/go-datastore from 0.4.2 to 0.4.4
…igin/feat/peer-filtering
dht.go
Outdated
@@ -259,6 +264,9 @@ func makeRoutingTable(h host.Host, cfg config) (*kb.RoutingTable, error) { | |||
rtPvLogger.Errorf("failed to connect to peer %s for validation, err=%s", p, err) | |||
return false | |||
} | |||
if !cfg.routingTable.peerFilter(h, h.Network().ConnsToPeer(p)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aschmahmann : is using the RT peerFilter within the peerValidationFunction
of the routing table the right place to put it after the kbucket refactor?
dht_filters.go
Outdated
} | ||
|
||
// PrivateQueryFilter returns true if the peer is suspected of being accessible over a shared private network | ||
func PrivateQueryFilter(h host.Host, ai peer.AddrInfo) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #471 (comment)
dht_filters.go
Outdated
return true | ||
} | ||
} | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit too aggressive. If they share a public address with us, we're using NAT hairpin.
dht_filters.go
Outdated
} else { | ||
hasPrivateAddr = true | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dht_filters.go
Outdated
|
||
// PrivateRoutingTableFilter allows a peer to be added to the routing table if the connections to that peer indicate | ||
// that it is on a private network | ||
func PrivateRoutingTableFilter(h host.Host, conns []network.Conn) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, wait, why aren't we using all this logic in the PrivateQueryFilter
? The query filter should generally be less restrictive than the routing table filter.
@willscott take a look at some of the unresolved discussions including:
(you'll probably have to go through the unresolved discussions one by one) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty much ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's work through the filtering logic explaining what we're doing and why (in comments). That will help us figure out which choices appear to be arbitrary and/or inconsistent.
dht_filters.go
Outdated
|
||
for _, c := range conns { | ||
ra := c.RemoteMultiaddr() | ||
if manet.IsPrivateAddr(ra) || isRelayAddr(ra) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will add all relayed peers to the private routing table.
dht_filters.go
Outdated
if !isRelayAddr(addr) && manet.IsPublicAddr(addr) { | ||
return true | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further consideration, drop this entire check. We only want peers that announce public addresses. Direct connections form undialable peers will have public addresses.
The check below should (I think?) look at all the addresses advertised by the peer (assuming identify has finished).
Allows specifying peer filter functions on query and on adding peers to the routing table. This patch also includes some reasonable default functions for a public-only and private-only DHT. Co-authored-by: Will Scott <will@cypherpunk.email> Co-authored-by: Steven Allen <steven@stebalien.com>
Required for libp2p/go-libp2p#803. Allows peers to be filtered out of queries and routing tables, comes with a couple suggested functions for public and private network peer filtering.