Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Pagination in a large room with restricted visibility can cause an OOM #12523

Closed
richvdh opened this issue Apr 22, 2022 · 2 comments · Fixed by #12522
Closed

Pagination in a large room with restricted visibility can cause an OOM #12523

richvdh opened this issue Apr 22, 2022 · 2 comments · Fixed by #12522
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) A-Performance Performance, both client-facing and admin-facing T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@richvdh
Copy link
Member

richvdh commented Apr 22, 2022

Consider a large room (thousands of members) with thousands of backward extremities.

A client makes a /messages request, which triggers us to consider whether to backfill from a remote server.

This leads to this bit of code, where we look at each of the backwards extremities and decide if they are likely to be visible to any of our users. (Actually, it's worse - we look at each of the successors of the backwards extremities, which can increase the number of events by an order of magnitude.) For each event we consider in this way, we end up pulling the event ids for the entire room state at that event into memory. (We don't actually pull the events themselves, unless they are membership events for our local server).

We can easily end up trying to hold tens of millions of event ids in memory at once for this. It completely blows the *stateGroupMembersCache*, and leads to an OOM after a minute or two.

@richvdh
Copy link
Member Author

richvdh commented Apr 22, 2022

note that we do that filtering for every single /messages request - even if we don't need to backfill.

It's also worth noting that this is rather self-exacerbating. Because we can't backfill, we can never "heal" the backwards extremities, so we end up accumulating more and more.

@richvdh
Copy link
Member Author

richvdh commented Apr 25, 2022

This seems to date from #4699 (Synapse 0.99.3), which, ironically, contains the review comment:

are we not likely to end up doing the filter_events_for server thing (which is quite expensive) repeatedly, every time anyone asks for some backfill?

@MadLittleMods MadLittleMods added A-Performance Performance, both client-facing and admin-facing T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Apr 25, 2022
@MadLittleMods MadLittleMods added the A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) label Jul 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) A-Performance Performance, both client-facing and admin-facing T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants