Backfill causes wrong membership state to be sent to clients #11874
Comments
This sounds like it's basically the same as matrix-org/matrix-spec#1209, possibly with a side-helping of "pick any two: consistency, availability, partition tolerance". It would be good to get an idea of the actual event DAG in this room, if you can dig it out of the database.
Yes, this is probably matrix-org/matrix-spec#1209, but I didn't find that again. If you can tell me what queries you need, I can run them for you. Actually, matrix-org/matrix-spec#1209 is more about state res changes not being propagated to clients. This is about events being propagated to clients, that should not affect the state due to state res. Propagating state res changes via |
you should be able to figure out the DAG by looking at the |
leave: $WK6jEn7ig8KrO5iRwyAofQeUTU81QWX-B_kOKlSUl68
kick: $V-38lZCzKnPpgtoKfZsFzJ8C1XpG4uHVfT0kdK8gN-Q
invite: $Y_MQVtt5-65xLmdYVWkUnaxUImPh5em8SdMmAKFdj-k
Do you need any more edges?
Here is a more complete log, which starts at the top of the screenshot and goes until the invite:
As you can see, Cat's message ( |
Thank you for digging out the edges. Plugging all that into graphviz:
It sounds a lot like matrix-org/matrix-spec#1209, or a variant of it? For whatever reason, client B isn't realising that A has left. |
@squahtx Kind of. The problem isn't the client realizing that the user left. The problem is that the user rejoined, and then state events that include a leave event get backfilled. At that point the user has already rejoined, so the leave event doesn't actually change the room state and gets state res'ed away. But the client has no way of knowing that; it just gets sent the state event without any of that context. Frankly, that event just shouldn't be sent to clients if it doesn't affect room state anymore.
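To make the failure mode concrete, here is a minimal sketch of a client that tracks membership by applying events in the order they arrive. The event IDs, the user ID, and the arrival order are assumptions for illustration, not the actual events from this room, and real clients are more involved than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberEvent:
    event_id: str    # hypothetical IDs, not the ones from this room
    user_id: str
    membership: str  # "join", "leave", "invite", ...


def apply_in_arrival_order(events):
    """Track membership the way a simple client would: last event received wins."""
    members = {}
    for ev in events:
        members[ev.user_id] = ev.membership
    return members


# Order in which the events reach the client (assumed for illustration).
received = [
    MemberEvent("$rejoin", "@tomm:example.org", "join"),
    # Backfilled later, although it sits before the rejoin in the DAG.
    MemberEvent("$old_leave", "@tomm:example.org", "leave"),
]

# What the server's state resolution settled on (assumed).
server_state = {"@tomm:example.org": "join"}

client_state = apply_in_arrival_order(received)
```

With this input the client ends up believing the user has left, while the server still considers them joined, which matches the mismatch described above.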
I am very confused. Which event ID in the DAG does user A rejoin at? And which event does backfill happen at? |
I believe this issue is touching a problem similar to https://github.com/matrix-org/matrix-doc/issues/3263, linking it for relevance |
That event isn't in the graph or screenshot. It was half an hour earlier, minutes after they left. That's why I needed to kick the user before inviting them. My client thought they were gone, but the server knew they weren't. |
Yup, this looks to be the same root cause as matrix-org/matrix-spec#1209: the server doesn't tell clients in |
So matrix-org/matrix-spec#1209 is describing a specific instance of a wider problem: the server sends events down Ideally the server would separately signal what the new state is, but currently it has no way to. We could add a bodge to drop events that don't affect state, but then you might miss important info like someone did try and join, or whatever. Closing as dupe. |
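For illustration only, a crude version of that bodge might look like the sketch below. It is not Synapse code: the function name, the event dictionaries, and the `current_state` mapping are all hypothetical, and it is blunter than "drop events that don't affect state" because it keeps only state events that are part of the current resolved state.

```python
def drop_superseded_state(timeline, current_state):
    """Keep message events, plus state events that are current for their key.

    timeline: list of dicts with "event_id", "type" and, for state events,
        "state_key".
    current_state: maps (type, state_key) to the event_id that won state
        resolution.
    """
    kept = []
    for ev in timeline:
        if "state_key" not in ev:
            kept.append(ev)  # ordinary message event, always passed through
            continue
        key = (ev["type"], ev["state_key"])
        if current_state.get(key) == ev["event_id"]:
            kept.append(ev)
    return kept


# Hypothetical data: the rejoin is the current membership event.
current_state = {("m.room.member", "@tomm:example.org"): "$rejoin"}
timeline = [
    {"event_id": "$msg", "type": "m.room.message"},
    {"event_id": "$old_leave", "type": "m.room.member",
     "state_key": "@tomm:example.org"},
]
filtered = drop_superseded_state(timeline, current_state)
```

Here the stale leave is dropped and the message is kept. The downside is the one mentioned above: the client never learns that the leave happened at all.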
(Just make sure you eventually fix both of those issues instead of just the specific one in matrix-org/matrix-spec#1209, because I would not have expected this issue from reading matrix-org/matrix-spec#1209 alone.)
FTR it sounds like the work we're doing on sliding sync will fix this issue |
Description
Translation for non-German speakers:
The above state is what I got via /sync. Basically, Cat sending messages caused Tomm's events to be backfilled. Synapse correctly calculated that the user is still in the room and that the backfilled leave event was outdated. But it still sent the leave event to my client. My client has no way of knowing that this is not the latest state, so it assumed the user was no longer in the room. But I could not invite the user again, so I kicked them, and then I could invite them.
This is bad for multiple reasons. It makes tracking membership for E2EE messy, and some users will be unable to decrypt messages because of this. It also makes the membership list of a room unreliable in general.
I'm not sure what the proper solution for this would be from the spec side, but I would assume you should always send the most current state event to a client instead of outdated ones that are not in the current room state. Maybe just in `state`, to not mess with the timeline.
Steps to reproduce
These steps are not exact, because I can't actually reproduce it. Network issues are hard. But something like that must have happened.
Version information
If not matrix.org:
Version: 1.51
Install method: ebuild, debian package, docker