Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Ensure we always drop the federation inbound lock #10336

Merged
merged 3 commits into from
Jul 9, 2021
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/10336.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix bug where inbound federation in a room could be delayed due to not correctly dropping a lock. Introduced in v1.37.1.
5 changes: 4 additions & 1 deletion synapse/federation/federation_server.py
Original file line number Diff line number Diff line change
Expand Up @@ -949,7 +949,10 @@ async def _process_incoming_pdus_in_room_inner(
room_id, room_version
)
if not next:
return
# We need to ensure that we always use the lock, so that it gets
# correctly dropped.
async with lock:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't help wondering if Lock should really be implementing the context manager interface. What's the advantage over just mandating that people call lock.release() ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I find using a context manager interface clearer than using a try/finally. We could also have an explicit lock.release() to use here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I find using a context manager interface clearer than using a try/finally

well maybe, but that seems a bit irrelevant.

basically:

lock.release()

seems a more intuitive way of dropping a lock than

async with lock():
    pass

I think it's basically a bit silly to have an aenter which does nothing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a lock.release() method

Personally I find using a context manager interface clearer than using a try/finally

well maybe, but that seems a bit irrelevant.

How so? The other place we use the async with lock we'd need to convert to a try/finally if we got rid of the context manager.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? The other place we use the async with lock we'd need to convert to a try/finally if we got rid of the context manager.

Oh I see. Well, yes, we would, and tbh I think that would be an improvement. Currently it looks like something magical happens at the start of that async with block, which is misleading. The with mechanism doesn't really work very well here, because in each case we already hold the resource (ie, the lock) when we enter the with block.

Instead we could put the whole body of _process_incoming_pdus_in_room_inner inside a try block:

async def _process_incoming_pdus_in_room_inner(...):
    try:
        # ...
        while True:
            await self.handler.on_receive_pdu(event)

            # drop the lock before checking for more events
            await lock.release()
            lock = None

            next = await self.store.get_next_staged_event_for_room()
            if not next:
                break
            
            lock = await self.store.try_acquire_lock()
            if not lock:
                break
    finally:
        if lock:
            lock.release()

... thus ensuring that we always drop the lock even if we end up bailing out at odd points due to exceptions or whatever.

Anyway, what you have now is fine.


origin, event = next
else:
Expand Down