batch outgoing RRs better #4730
In order to reduce the number of outgoing federation transactions, we want to aggregate read-receipts, since there is little harm if outgoing RRs are delayed by a few seconds, and we can dramatically reduce the number of transactions by doing so. This change introduces the concept of EDU 'buckets'; currently we have the 'Instant' bucket and the 'Delayed' bucket. Fixes #4730, #3951.
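For illustration, a minimal sketch of what the instant/delayed split could look like (class names, attributes and the 5-second window here are assumptions for the sake of the example, not the actual Synapse implementation):

```python
import time
from enum import Enum


class EduBucket(Enum):
    # Hypothetical names mirroring the PR description: "instant" EDUs
    # trigger a transaction immediately; "delayed" EDUs (such as read
    # receipts) wait a few seconds so several can share one transaction.
    INSTANT = "instant"
    DELAYED = "delayed"


class PerDestinationQueue:
    """Illustrative per-destination EDU queue with two buckets."""

    DELAYED_FLUSH_SECS = 5.0  # assumed batching window

    def __init__(self):
        self._pending = {EduBucket.INSTANT: [], EduBucket.DELAYED: []}
        self._delayed_since = None

    def queue_edu(self, edu, bucket):
        self._pending[bucket].append(edu)
        if bucket is EduBucket.DELAYED and self._delayed_since is None:
            self._delayed_since = time.monotonic()

    def should_send_now(self):
        # Anything in the instant bucket flushes the delayed bucket too,
        # so delayed EDUs piggy-back on other outgoing traffic.
        if self._pending[EduBucket.INSTANT]:
            return True
        if self._delayed_since is None:
            return False
        return time.monotonic() - self._delayed_since >= self.DELAYED_FLUSH_SECS
```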
I've been doing some analysis on RRs to determine how we can best handle them. The following analysis is based on a period of a few hours today, which is a fairly average time of day, but still with enough RR peaks to let the federation sender get a couple of minutes behind at times.

Firstly: the number of RR EDUs increases as … [chart]

Some datapoints about RR age: 16% of RRs arrive within 1 second of their corresponding event, and 37% within 5 seconds. RR EDUs tend to be somewhat older on average: only 7% happen within 1 second of their event, and 28% within 5 seconds. Presumably this means that RRs happen later in large rooms like Matrix HQ. Probably makes sense: people are more likely to read messages in one-to-one rooms immediately. It does suggest that we might want to consider using a different bucketing scheme for larger rooms.

In the sample period, there were 399 RRs for Matrix HQ, each of which would have been sent to around 667 servers. (This will include "offline" servers, and in practice some EDUs will be grouped into transactions if the remote server did not respond immediately.)
Some analysis on different bucketing techniques as they would affect Matrix HQ:
Any other suggestions?
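(For reference, the "399 -> N" figures quoted in the next few comments can be produced by replaying the recorded RR timestamps against a candidate batching rule. The following is only a sketch of such a replay, with a made-up `hold_for` parameter; it is not the script actually used for the analysis.)

```python
def count_flushes(receipt_times, hold_for):
    """Count how many batched transmissions result from a stream of RR
    timestamps (in seconds), if each RR may be held back for at most
    hold_for(rr_time) seconds and all pending RRs are flushed together.
    """
    flushes = 0
    deadline = None  # earliest time by which the pending batch must go out

    for t in sorted(receipt_times):
        if deadline is not None and t > deadline:
            # The pending batch had to be flushed before this RR arrived.
            flushes += 1
            deadline = None
        this_deadline = t + hold_for(t)
        deadline = this_deadline if deadline is None else min(deadline, this_deadline)

    if deadline is not None:
        flushes += 1  # flush whatever is still pending at the end
    return flushes


# e.g. a flat 5-second bucket:
#   count_flushes(hq_rr_times, lambda t: 5.0)
# or an age-dependent delay, if the corresponding event times are known.
```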
Basing it on a factor of the age feels like a good one, and half feels about right. I might be tempted to squish the low age RRs a bit more so that things feel a bit more instant (e.g. …).
399->244 and 399->228 seem to be relatively modest improvements, although I much prefer the idea of only aggressively batching RRs for older events. However, I'm finding it hard to visualise the circumstances in which RRs for older events even happen. Are they ones generated by clients as they catch up on scrollback? I'm a bit confused as to why we'd be sending these, as all clients I know of load the room at the bottom of the page, where they'd send an RR for the most recent event... and so RRs for historical events are irrelevant (other than for updating the RM, which is specific to the local user).
erik's suggestion of …
Unless somebody happens to be watching the current timeline of a given room at the point that an event is sent, there will be a delay before they open the app, switch to the room, etc.
oh, of course. so the "old event" could still be the most recent one in the room... it's just that we don't rapidly show RRs as people catch up on a message if that message happens to be old? I'm a bit worried that this could still feel weird (especially if the message is an hour old, and it takes 30 minutes for the RR to be sent?!), given that, from a user POV, the expectation people are used to is that idle users coming back online should immediately send an RR as they catch up (especially given we lack presence).
I don't quite understand the question. I think you're describing my proposed behaviour.
This is why I'm suggesting capping at 30 seconds.
I was repeating my understanding back to check i had it right, which it seems like i do. Would an alternative approach be to simply throttle the rate at which we send RRs in a room? So if things are slow (e.g. idle users returning and catching up on scrollback), the RRs go out rapidly, acting as a nice proxy for presence. But if a conversation is in full swing and RRs are flying around everywhere, we do something like "don't send RRs more rapidly than 1 batch every N seconds, per room"? This would then give us "only throttle in a crisis" behaviour. Or is that what the original PR did? :S
Ooh, nice. FTR that basically shakes out to adding a 50% delay for older events (e.g. if it's taken you 15s to send an RR then we delay it by ~7s), but much less for recent events (e.g. an RR sent within 2s would only be delayed by ~600ms, and one at 1s by ~250ms).
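For reference, one formula with roughly that shape is delay = age² / (2·(age + 1)); this is reverse-engineered from the example numbers above rather than anything actually proposed in the thread:

```python
def rr_delay(age_secs: float, cap_secs: float = 30.0) -> float:
    """Delay an outgoing RR by roughly half its age, squished for very
    fresh RRs so that they still feel instant.

    age = 15s -> ~7s, age = 2s -> ~0.7s, age = 1s -> 0.25s, which is in
    the same ballpark as the numbers quoted above. The 30s cap matches
    the earlier suggestion in this thread.
    """
    delay = age_secs * age_secs / (2.0 * (age_secs + 1.0))
    return min(delay, cap_secs)
```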
presumably if we are delaying by 7s, in practice this means that it'll get sent sooner anyway if a less-delayed RR needs to get sent in the interim?
Yup, or something else needs to get sent like a typing notification, events, etc.
It's not quite the same, because (a) the throttling in #4777 was independent of rooms; (b) it doesn't send anything out immediately. So: Matthew's suggestion of throttling RRs per room to one batch every N seconds: 190 transmissions with N=30.
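A sketch of that per-room throttle, with the interval as a parameter (the names are invented for illustration; this is not the code from #4777 or anywhere else):

```python
import time


class PerRoomReceiptThrottle:
    """One batch of outgoing RRs per room every `min_interval_secs`."""

    def __init__(self, min_interval_secs: float = 3.0):
        self._min_interval = min_interval_secs
        self._last_sent = {}  # room_id -> monotonic time of the last batch
        self._pending = {}    # room_id -> receipts waiting to go out

    def add_receipt(self, room_id: str, receipt) -> list:
        """Queue a receipt; returns a batch to send now, or [] to wait.

        A real implementation would also need a timer to flush the
        pending queue once the interval expires, rather than waiting
        for the next receipt to arrive.
        """
        self._pending.setdefault(room_id, []).append(receipt)
        now = time.monotonic()
        last = self._last_sent.get(room_id)
        if last is None or now - last >= self._min_interval:
            self._last_sent[room_id] = now
            return self._pending.pop(room_id)
        return []
```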
i was thinking a much lower value of N ftr - e.g. 3-5s
well, not that nice imho. 245 is a bit of a marginal reduction.
Doesn't that mean a second receipt sent in a room from a given server will potentially be delayed by up to 30s?
At N=3: 339 transmissions
well, yes.
The point being that currently it's unusual for two RRs to land within 3s of each other, so this doesn't change much.
hum. what are we even trying to optimise for here? i thought the problem was that CPU was spiking/saturating when there's a lot of traffic, and we end up sending RRs too frequently because we don't batch them. Are we trying to reduce the total number of RRs, or just smooth things out to not muller the CPU? In which case perhaps 399->339 is okay, as long as they're spread out nicely no more rapidly than every 3s?
We're trying to reduce the frequency of transactions, hence the idea of waiting a bit to see if any more RRs come in before sending the transaction.
The problem is that for large rooms we need to send a transaction to each host in the room, so every 3s could easily mean 100Hz of outbound transactions.
okay, could we pick the "send RRs no more rapidly than every N seconds" such that N ensures less than X Hz of outbound transactions? So small rooms don't get crippled, but big ones do?
Having thought about this a bit harder: yes you're right that this does make a difference, because although the reduction is only 20%, it happens at the most beneficial time. However, as erik says, it's not enough, because empirically we top out around 150Hz of transactions and one transaction every 3 seconds to each of 600 servers is 200Hz.
yes, this definitely sounds like a possible strategy. Let's say you're aiming for 100Hz of transactions: when you get a RR in HQ, you realise it needs to go to 667 servers, so you say that you won't forward any more RRs for that room for another ~6.7 seconds (667 / 100). We might need to be careful about what happens if there are two rooms with activity at once, but this is currently rare.
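That arithmetic generalises to a simple rule: after sending a batch of RRs for a room, hold off further RRs for num_destinations / target_hz seconds. A sketch (the 100Hz target is just the number used in the example above):

```python
def rr_holdoff_secs(num_destinations: int, target_txn_hz: float = 100.0) -> float:
    """After sending a batch of RRs for a room, suppress further RR batches
    for this long, so that RR traffic alone stays under `target_txn_hz`
    outbound transactions per second.

    e.g. Matrix HQ with ~667 destination servers and a 100Hz budget:
    667 / 100 = 6.67s, i.e. the ~7s figure discussed in this thread.
    """
    return num_destinations / target_txn_hz
```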
This does mean that for large rooms RRs will be significantly delayed, as the first RR will go out to all servers, and then it waits 7s before sending any more. That sounds like it's gonna be quite noticeable in a conversation tbh. I'm also slightly wary that instead of defining this in terms of "this is an acceptable delay", which is fairly static, we now have a magic parameter that we'd need to keep tweaking as our load/usage patterns change (e.g. how many large rooms have concurrent conversations).
well right, but empirically going any faster than that means we start getting behind.
I suspect we'd end up tweaking the "acceptable delay" just as much.
It might just be sufficient to maintain a queue and stack the batches of RRs to go out in series?
well, that's what happens at the moment. The problem is that everything else gets caught up in it. It's difficult to come up with a mechanism where RRs are queued separately to events without RRs getting completely starved out.
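For what it's worth, one shape such a mechanism could take is to keep RRs in their own queue but drain a bounded number of them into every outgoing transaction; note this still starves RRs if nothing else is being sent at all, which is part of the difficulty. Purely a sketch under assumed queue structures:

```python
# MAX_RRS_PER_TXN is a made-up cap, not a real Synapse constant.
MAX_RRS_PER_TXN = 20


def build_transaction(pending_pdus, pending_other_edus, pending_rr_edus):
    """Build one outgoing transaction from separate queues (sketch only).

    RRs sit in their own queue, so a burst of them cannot hold up events
    or other EDUs; every transaction that does go out drains up to
    MAX_RRS_PER_TXN of them, so they are not starved while there is
    other traffic.
    """
    pdus = list(pending_pdus)
    pending_pdus.clear()

    edus = list(pending_other_edus)
    pending_other_edus.clear()

    edus.extend(pending_rr_edus[:MAX_RRS_PER_TXN])
    del pending_rr_edus[:MAX_RRS_PER_TXN]

    return {"pdus": pdus, "edus": edus}
```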
I think we need to try something here rather than continue to theorise about what might or might not feel ok. I'm going to try Matthew's suggestion of rate-limiting RRs by room size. As Erik points out, that might make RRs feel unacceptably laggy, so we might need to consider tweaking it in the future.
This is mostly a prerequisite for #4730, but also fits with the general theme of "move everything off the master that we possibly can".
Rate-limit outgoing read-receipts as per #4730.
this has been deployed on matrix.org since 2019-03-21 14:14:49, ftr.
I made a heinous bodge which delays all outgoing EDUs (including typing notifs) by 5 seconds. We need to do something better.