-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
seenMessages TTL should be based on last observed time, not first #502
Comments
Please note that we have experienced a severe issue in the past with seeing a flood of pubsub messages suddenly come in all at once. On closer observation we notice that many of the messages are duplicate messages that we have seen before. We reported it a while ago against the JS implementation here, but we have seen it more recently on go-ipfs v12 as well. In the most recent occurrence, we never saw the same message duplicated within a 2 minute window, but would often see the same messages coming into our node every 2-3 minutes. So I think it's possible that this relatively small change could have a significant benefit for us to avoid these quite disastrous pubsub floods that we have been experiencing sporadically. |
We also discussed the more recent occurrence of the pubsub flood issue in this slack thread with some PL folks, for reference: https://filecoinproject.slack.com/archives/C025ZN5LNV8/p1661459082059149 |
This makes sense, care for a patch? |
Probably as an option, it can be turned on by default. |
@vyzo Looks like I'd have to either modify or fork the timecache library as it doesn't provide any way to update the expiration time for an existing entry (it even errors if you try to add the same key twice). Do you know who the maintainer is? Any idea if I'd be better off making a PR to the existing library vs forking it or making a new TTL cache implementation? |
It's not really maintained, we should probably move it in tree. |
By default, the libp2p seen messages cache uses the `FirstSeen` strategy which expires an entry once TTL elapses from the time it was added. This means that if a single message is being received frequently and consistently, pubsub will re-broadcast it every TTL, rather than never re-broadcasting it. In the context of the Keep client which additionally uses app-level retransmissions, that often leads to a strong message amplification in the broadcast channel which cause a significant increase in the network load. As the problem is quite common (see libp2p/go-libp2p-pubsub#502), the libp2p team added a new `LastSeen` strategy which behaves differently. This strategy expires an entry once TTL elapses from the last time the message was touched by a cache write (`Add`) or read (`Has`) operation. That gives the desired behavior of never re-broadcasting a message that was already seen within the last TTL period. This reduces the risk of unintended over-amplification.
Refs: #3770 Depends on: #3771 Recent libp2p versions (we started to use them in #3771) introduced a way to set the seen messages cache TTL and strategy. Here we leverage those settings to reduce the excessive message flooding effect that sometimes occurs on mainnet. This pull request consists of two steps ### Use longer TTL for pubsub seen messages cache Once a message is received and validated, pubsub re-broadcasts it to other peers and puts it into the seen messages cache. This way, subsequent arrivals of the same message are not re-broadcasted unnecessarily. This mechanism is important for the network to avoid excessive message flooding. The default value used by libp2p is 2 minutes. However, Keep client messaging sessions are quite time-consuming so, we use a longer TTL of 5 minutes to reduce flooding risk even further. Worth noting that this time cannot be too long as the cache may grow excessively and impact memory consumption. ### Use `LastSeen` as seen messages cache strategy By default, the libp2p seen messages cache uses the `FirstSeen` strategy which expires an entry once TTL elapses from when it was added. This means that if a single message is being received frequently and consistently, pubsub will re-broadcast it every TTL, rather than never re-broadcasting it. In the context of the Keep client which additionally uses app-level retransmissions, that often leads to a strong message amplification in the broadcast channel which causes a significant increase in the network load. As the problem is quite common (see libp2p/go-libp2p-pubsub#502), the libp2p team added a new `LastSeen` strategy which behaves differently. This strategy expires an entry once TTL elapses from the last time the message was touched by a cache write (`Add`) or read (`Has`) operation. That gives the desired behavior of never re-broadcasting a message that was already seen within the last TTL period. This reduces the risk of unintended over-amplification.
By default, the libp2p seen messages cache uses the `FirstSeen` strategy which expires an entry once TTL elapses from the time it was added. This means that if a single message is being received frequently and consistently, pubsub will re-broadcast it every TTL, rather than never re-broadcasting it. In the context of the Keep client which additionally uses app-level retransmissions, that often leads to a strong message amplification in the broadcast channel which cause a significant increase in the network load. As the problem is quite common (see libp2p/go-libp2p-pubsub#502), the libp2p team added a new `LastSeen` strategy which behaves differently. This strategy expires an entry once TTL elapses from the last time the message was touched by a cache write (`Add`) or read (`Has`) operation. That gives the desired behavior of never re-broadcasting a message that was already seen within the last TTL period. This reduces the risk of unintended over-amplification. (cherry picked from commit 89a7b77)
The seenMessage cache prevents us from processing and re-broadcasting a pubsub message that we have already seen within the
seenMsgTTL
(default 2 minutes). Accessing an existing message, however, does not update the timestamp associated with the entry in the cache.This means that if a single pubsub message is being received frequently and consistently, we'll re-broadcast it every
seenMsgTTL
, rather than never re-broadcasting it as is the desired behavior. If every time we received a duplicate pubsub message we pushed back the expiration time in the cache, that would give us the desired behavior of never re-broadcasting a message that we have already seen within the lastseenMsgTTL
seconds.The text was updated successfully, but these errors were encountered: