-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
fix push notifications for invites from remote servers #11410
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be happy to merge in (or rebase, at your preference, since nothing has been reviewed yet) develop? I gave it a quick try but there were cases where the better push typing PR seems to have diverged, so I don't trust myself to do it properly.
Sorry for the hassle
Those weren't triggering normal pushes as they are stored as outliers in the db (= we have no local context for the event). Instead we bypass all logic around storing push actions and then reading them back from the db and instead just directly construct a push action and send it to the pushers. This only does http pushes for now.
c050ae7
to
df7880a
Compare
@reivilibre Rebased on develop! Thanks for looking at it :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing the rebase!
I've been staring at this for a while — I suspect it's not 'right' architecturally but I really wanted to try to say something helpful rather than just 'it's wrong'.
The main red flag I see with it is that push notifications are sent by a specific worker (the pusher
worker type) but in this PR, a worker handling federation will be sending the push.
I'll have to be honest with you that I'm not very familiar with the structure of the push code here and what I do know is from a few years ago..
My suspicion is that maybe we should be putting actions into the staging area, then persisting the event, with a catch clause that will back out the staging area if the persist fails. (That said, I just described the same thing that _run_push_actions_and_persist_event
does.)
Maybe the answer is that _run_push_actions_and_persist_event
needs to be adapted to be happy with outliers (at least for invite events) and then the invite code should use that instead of plain persist_events_and_notify
?
I just worry that I don't know why outliers aren't allowed in _run_push_actions_and_persist_event
^1 and that's probably pretty critical for doing this. It might be that we need to carry some special cases down for these outlier invites.
I'll put this back on the queue in case anyone else has any more intelligent thoughts.
^1: if we find out, we could at the very least document it e.g. in the docstring of _run_push_actions_and_persist_event
@@ -818,6 +818,8 @@ async def on_invite_request( | |||
event.room_id, [(event, context)] | |||
) | |||
|
|||
await self._federation_event_handler.notify_remote_invite(event, context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notes: So the reason this issue happens is that persist_events_and_notify
above is being used instead of _run_push_actions_and_persist_event
(or on_send_membership_event
which calls it) which both persists the event and sets up some push actions in the staging area^1.
The reason it's done this way seems to be that _run_push_actions_and_persist_event
requires that the event is not an outlier (event with unknown state), whereas these invites are outliers. (I'm not sure what reason this was done for: perhaps push rules need to know some state in the room, since push rules let you check things like room size etc?)
^1: (not sure what these actually are — looks like a way to prepare the results of push actions and the code that persists the events will move them over. Wonder why we bother doing all that in the database rather than passing them to the event persister....?)
Co-authored-by: reivilibre <[email protected]>
I tried to review and ended up coming to roughly the same conclusion as @reivilibre above. The change doesn't sit right architecturally, but I'm not familiar enough with this part of the code to suggest what should be done instead. Additionally, won't this send notifications twice for invites into rooms that the homeserver is already participating in? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As others have said, this crosses various architecture boundaries.
What happens if we simply relax the assert not event.internal_metadata.outlier
in _run_push_actions_and_persist_event
to allow events with event.internal_metadata.out_of_band_membership
, and call that instead of persist_events_and_notify
?
This PR has been sitting around for a while. I'm going to close it for now to tidy the review queue a bit, but please re-open it (and rebase on latest |
So while this solution works and actually solves the referenced bug, I'm not sure if this is doing things correctly. In fact I'm almost certain it's not doing things correctly, but I don't have enough insight into why the push code is structured as is to say why exactly circumventing that structure is bad. This bypasses any database persistence for this specific case of pushes and just directly sends those out.
I'm happy to restructure this with advice from someone with more background knowledge here.
Based on #11409
Closes #8626
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)