This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Sync race cache invalidation fixes part 2 #14156

Contributor

@Fizzadar Fizzadar commented Oct 12, 2022

See issue: #14154

More involved than the first fix, reviewable commit-by-commit. This essentially serves as a workaround for delayed cache invalidation by utilising the membership events we already pull as part of sync to identify any missing rooms.

Also has two always-invalidate cases for safety. I believe both occur so rarely that the extra invalidation won't have any noticeable performance impact.

Signed off by Nick @ Beeper (@Fizzadar).

This removes the extra token check and also means we can check membership
changes before the token that may get lost due to a stale cache value
returned from `get_rooms_for_user`. This can then be used to work around
a race between cache invalidation over replication and sync requests.

This avoids race conditions with cache invalidation over replication. Only
called for membership changes between two syncs that also don't show up
in the get rooms for user call and should thus be sufficiently rare that
this doesn't materially impact database performance.

This ensures no race conditions with cache invalidation. Since init syncs
are usually new devices or users this is unlikely to impact database
performance.
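
For reviewers, here is a rough, standalone sketch of the idea behind the membership-based workaround. It is not the code in this PR: `MembershipChange` and `find_missing_rooms` are hypothetical names, and the real sync handler works with Synapse's own event and store types. It only illustrates how membership changes already fetched for a sync could reveal joined rooms that a stale cached room set is missing.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class MembershipChange:
    """Hypothetical stand-in for a membership event fetched during sync."""

    room_id: str
    membership: str  # e.g. "join" or "leave"
    stream_ordering: int


def find_missing_rooms(
    cached_rooms: FrozenSet[str],
    changes: Iterable[MembershipChange],
) -> Set[str]:
    """Return rooms the user has joined that the cached set does not contain.

    Assumption: the latest membership change per room (by stream ordering)
    reflects the user's current membership in that room.
    """
    latest: Dict[str, MembershipChange] = {}
    for change in sorted(changes, key=lambda c: c.stream_ordering):
        # Later changes overwrite earlier ones for the same room.
        latest[change.room_id] = change

    return {
        room_id
        for room_id, change in latest.items()
        if change.membership == "join" and room_id not in cached_rooms
    }
```

If the returned set is non-empty, the cached value was stale for this user, and the caller could then invalidate and re-fetch rather than trusting the cache. An empty set means the cache and the membership changes agree, so no extra database work is needed.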
@Fizzadar Fizzadar marked this pull request as ready for review October 12, 2022 15:31
@Fizzadar Fizzadar requested a review from a team as a code owner October 12, 2022 15:31
# If we have no since token (init sync), ensure any cached rooms for the user
# are first invalidated to avoid race conditions with invalidation-over-replication.
if not since_token:
    self.store.get_rooms_for_user.invalidate((user_id,))
Member

I'm really hesitant to do this without fully understanding where the inconsistencies are coming in, as per #14154 (comment). In particular, it's not obvious to me that invalidating just this cache won't lead to really weird inconsistencies.

Contributor Author

IIRC this specific change wasn't based on an issue we saw, but rather on expected races similar to the other changes here. It probably makes sense to undo this commit, although invalidating `get_users_in_room` probably presents similar inconsistencies. Let's discuss in the issue and come back to this.

@Fizzadar
Contributor Author

This is now redundant after #14723.
