This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Optimize how we calculate `likely_domains` during backfill #13626
Labels
A-Messages-Endpoint
/messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill)
A-Performance
Performance, both client-facing and admin-facing
O-Uncommon
Most users are unlikely to come across this or unexpected workflow
S-Minor
Blocks non-critical functionality, workarounds exist.
T-Enhancement
New features, changes in functionality, improvements in performance, or user-facing enhancements.
Mentioned in internal doc. Part of #13356
Optimize how we calculate `likely_domains` during backfill because I've seen this take 17s in production just to `get_current_state`, which is then used to `get_domains_from_state` (see case "2. Loading tons of events" in the `/messages` investigation issue).

There are 3 ways we currently calculate hosts that are in the room:

1. `get_current_state` -> `get_domains_from_state`
   - Used in `backfill` to calculate `likely_domains`, and in `/timestamp_to_event` because it was cargo-culted from `backfill`
2. `get_current_hosts_in_room`
3. `get_hosts_in_room_at_events`
   - Used in `_process_event_queue_loop`
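To make the cost of the first approach concrete, here is a minimal sketch of deriving hosts from room membership in plain Python. It is not Synapse's implementation: `domain_from_user_id` and `hosts_from_memberships` are hypothetical helpers, and the input is assumed to be `(user_id, membership)` pairs that have already been pulled out of the room's current state, which is the expensive part when the state contains tens of thousands of events.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def domain_from_user_id(user_id: str) -> str:
    """Return the server name of a Matrix user ID such as "@alice:example.org"."""
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a valid user ID: {user_id!r}")
    # The localpart cannot contain ":", so split on the first one only.
    # The server name itself may carry a port ("example.org:8448").
    return user_id.split(":", 1)[1]


def hosts_from_memberships(memberships: Iterable[Tuple[str, str]]) -> List[str]:
    """Hypothetical helper: hosts with at least one joined user.

    Hosts are ordered by number of joined users (descending), then by name.
    This ordering is an assumption made for the sketch only.
    """
    counts = Counter(
        domain_from_user_id(user_id)
        for user_id, membership in memberships
        if membership == "join"
    )
    return [host for host, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


example = [
    ("@alice:one.org", "join"),
    ("@bob:two.org", "join"),
    ("@carol:two.org", "join"),
    ("@dave:three.org", "leave"),
]
likely_hosts = hosts_from_memberships(example)
```

The derivation itself is cheap. The point of the issue is that fetching every state event just to feed a function like this is what dominates the runtime.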
Query performance
The query from `get_current_state` is slow simply because we have to fetch all 80k events. We see almost exactly the same performance locally when trying to get all of these events (16s vs 17s).

But what about `get_current_hosts_in_room`? When there are 8M rows in the `current_state_events` table, the query in `get_current_hosts_in_room` takes 13s from complete freshness (when the events were first added), but only 930ms after a Postgres restart, or 390ms when run back to back.

See the in-flight PR #13575 for more details.