-
-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider changing /messages
to use stream ordering for events
#852
Comments
This can be very problematic with MSC2716 matrix-org/matrix-spec-proposals#2716, because it relies heavily on the ordering of events. |
I think this is basically a duplicate of https://github.com/matrix-org/matrix-doc/issues/574. |
@richvdh The issue you posted is just about clarifying the current behavior. In this issue I also try to argue why the current behavior should be changed. |
related: matrix-org/matrix-spec-proposals#474 |
As said in #spec, I think a proper solution would be an extra parameter on every endpoint that asserts ordering of events, to either receive a "canonical" timeline (as how it'd be ordered in a utopian world where everything is up 100% and there's no latency), or an "as-received" timeline (as how events are received/fetched by the server). Sync could be altered to have timeline events, which'd technically fall "outside of view", arrive in a different key, to then have the client decide how to present these to the user (possibly with a small pop-up on the side saying "hey, there's delayed events in a conversation you participated in" or some kind) |
After talking about this issue in the P2P Matrix room, I think a server can note (at least) two things to a client;
|
This can't work for backfill (history before your HS joined the room), since these events were never received by your HS in the first place. Meaning you have to ask another HS for the events, but the ordering would be different depending on which HS you backfilled from.
You can already entice a HS to send you the event in federation format using filters, via the This also partially solves this problem, at least with well-behaved homeservers:
Because you'll eventually receive an event whose This problem mostly boils down to the fact that I think the solution would require to be able to paginate through the |
|
/messages
to use stream ordering for events
For cross-linking purposes (because I keep having to search for it), this comment outlines the problem of using separate orderings between the different APIs well, as well as explaining how it prevents efficient collection of the entire room timeline client-side. |
Related: matrix-org/matrix-spec-proposals#4033 which argues that we need to be explicit about message order for receipts to make sense. |
Why not? Backfilled events are permanently stored, so the order will be consistent from the perspective of the client (it will not be consistent with other homeservers, but that is not important)
Conduit solves this by always using /sync ordering, so there are no gaps. On first glance, this looks like a good solution, but it's unfortunate that we have to add yet another field to events. I will look at it in more detail later. |
I'm actually not sure what it is exactly that you were proposing. Was it that issuing a backfill would append all newly retrieved events to the given room's sync ordering? If that's the case, then the sync ordering will be vastly different from a reasonable ordering that should be shown by the client, which I think goes contrary to how most clients work today (am I wrong?). I think that what you are proposing is to more heavily rely on client-side sorting of the sync ordering using event timestamps. Also, I wouldn't a priori dismiss having an (eventually) consistent view across homeservers as unimportant. |
Sorry I see why I was unclear. Backfilled events with Conduit will never arrive over sync and can only be seen if you call /messages far enough back. So the timeline is actually a linear list of events that can be expanded at both ends. |
I'm still not entirely following the proposal, so let's try to clarify it with an example. Suppose your server has a DAG that looks like this, with assigned stream orderings in parentheses:
... and an event
Now, Synapse assigns backfilled events a negative stream ordering, and I don't think you're proposing to change that, so our room now looks like this:
So, I think your proposal is that ... And, extending the example, let's suppose that a later event |
Okay.
Conduit will fetch as many prev_events as possible before accepting F. That means, it will recursively fetch all prev_events to get D, E and F, sort them with topological ordering and then insert them in the correct order, this gives A(1) <- B(2) <- C(3) <- D(4) <- E(5) <- F(6) Conduit only uses /backfill to load events from before our server joined the room, that means if a user scrolls up to event A and the asks /messages, Conduit will backfill events Z, Y, X and insert them like this: X(-3) <- Y(-2) <- Z(-1) <- A(1) <- B(2) <- C(3) <- D(4) <- E(5) <- F(6) Does that make things more clear? |
Out of interest, how does it fetch them? With
kinda, but it raises more questions than it answers. What happens if |
Good questions!
It's just /event for each unknown event I think.
These cases are rare, because if a server uses D as a prev_events, they can usually provide D. In the case they don't, this will lead to missing events. If D is referenced again in the future, we will try loading it again.
There is a configurable limit of how many prev_events should be loaded per incoming event, the default is ~100 which works well in normal cases. |
Empirically, they're not as rare as you might imagine, particularly in the case of transitive relationships (eg, you are receiving event F from a server, which can provide the prev_event, but not the prev_event's prev_event).
Ok, and what stream ordering is it assigned if it is successfully loaded on the second attempt? One that puts it at the start of the timeline, right?
Ok, but this is recursive, right? You need to fetch the prev_events of the prev_events? There must be a limit to the recursion as well? In any case, the point here is: you can end up with gaps in your DAG, which could get filled later. The relevance to this discussion is: how do you fit such events into your linear timeline? |
In practice this seems to work well enough. To increase our chances, we could also ask some other servers that were active in that part of the timeline.
What is the "start" for you? It will insert it as one of the most recent events: A(1) <- B(2) <- C(3) <- E(5) <- F(6) <- D(7) <-G(8)
Conduit does a breath-first search and takes the first 100 events.
In general, incoming events are appended at the bottom of the timeline, so that it arrives over /sync. When there are missing prev_events, we will handle them first before handling the incoming event. |
Sidebar: this is not an accurate representation of the DAG. The actual DAG looks like this:
Anyway, this seems like a difference between Synapse and Conduit. With Synapse, each time a client makes a ... which presumably means that, if your server is offline for more than 100 events, there will be missing events that can never be filled? [Also: if you end up pulling in state events, which might be very old, in order to handle incoming events in the timeline -- for example, because they are referenced as |
It's not quite this bad, because when other servers send events to you, they start with the oldest events, so the gap is not quite that large. And in theory, the origin servers of events should send you every event before they send newer ones. So that alone should make all events arrive. However, I know that Synapse breaks this promise if a server is offline for too long. |
Er, I don't think that's correct.
I don't think the spec says that's a thing that is required, and the decision not to send older events is very deliberate in both Synapse and the spec. It's generally considered that new events are more important than those you may have missed from days or months ago. |
That's unfortunate, because it would make my idea of ordering much more stable. So what are your thoughts on the approach? Do you think there is a way to make it happen? |
I don't know to be honest. The problem is that we are currently trying to satisfy two conflicting requirements:
(and we currently address that conflict, badly, by using different orderings for If I were forced to give up one of those two requirements, I think it would be the first. (If a message was sent 6 months ago, it probably shouldn't pop up in the user's timeline as if it had been sent in the middle of today's conversation.) Indeed I think there was some experimental work about having Element show up such "late" messages in a separate panel or something. Not sure what happened to that. Anyway, the point is: I think if I were changing things, then rather than trying to make |
The problem with (2) is that there is no way of telling clients about those historical events, so it only works if they regularly clear their caches and load it from the server again. We should ask for more opinions. |
@andybalaam should be interested in this since I discussed this with him a couple of times and I'll ping him to voice his opinion instead of me repeating it badly. My gut feeling is to side with Rich and aim for a DAG-based ordering, but there are non-trivial problems with this approach. |
My opinions are biased by my work on receipts, which crucially depend on message ordering, but don't care about historical events. (Where by historical events I mean ones that have a negative stream order in Synapse.) My strongest opinion is that we need a way to page through events in an order that is sensible for clients. I believe that the best order for clients is "stream" ordering, except in the case of historical events. My reasoning: the user sees the events in the stream order. Even when they are late-arriving, I still think that it is sensible to display them to the user in the order they arrived for that user. Anything else is deeply confusing. It is very unfortunate, but true, that we need to agree on an ordering that all clients are happy with, and provide that ordering on the server. The reason for this is that if the clients present events in a different order, receipts work extremely counter-intuitively. If receipts did not use a before/after relationship (e.g. there was one receipt per message) then we would have the freedom for different clients to use different orderings, and we could ask the server to provide enough information to allow this. But our current design for receipts constrains us to pick one true ordering. As for historical events, that is very sad. It makes me want to design some kind of hybrid ordering that says "stream order except where the event graph contradicts it". This would mean that late-arriving messages because of a netsplit would appear at the bottom of the timeline (since no event graph relationship relates any events across the split) but historical messages would be placed at the beginning (since the latest event is listed as a prevevent of an old event). But this is difficult or impossible because to do it consistently you would need all events, and sorting events would need knowledge of all events - there would be no way to compare two individual events without referring to all the others just in case there is an indirect event graph relationship. |
…remental sync (#17510) Use `stream_ordering` based `timeline` pagination for incremental `/sync` in Sliding Sync. Previously, we were always using a `topological_ordering` but we should only be using that for historical scenarios (initial `/sync`, newly joined, or haven't sent the room down the connection before). This is slightly different than what the [spec suggests](https://spec.matrix.org/v1.10/client-server-api/#syncing) > Events are ordered in this API according to the arrival time of the event on the homeserver. This can conflict with other APIs which order events based on their partial ordering in the event graph. This can result in duplicate events being received (once per distinct API called). Clients SHOULD de-duplicate events based on the event ID when this happens. But we've had a [discussion below in this PR](#17510 (comment)) and this matches what Sync v2 already does and seems like it makes sense. Created a spec issue matrix-org/matrix-spec#1917 to clarify this. Related issues: - matrix-org/matrix-spec#1917 - matrix-org/matrix-spec#852 - matrix-org/matrix-spec-proposals#4033
Today the /messages endpoint documentation does not say anything about the
ordering of the events. /sync hints at one possible ordering method:
The current concept of ordering based on event graph is not ideal for the following reasons:
or is it's own prev event. This is not checked in the federation api.
incomplete because we don't know about events of a fork
if there are too many events
Possible solution:
/messages uses the same order as /sync, which is the order at which events
arrive at the server, not sorted at all.
This has the advantage of being a lot simpler and therefore doesn't have many
edge cases.
A problem is that Matrix events often arrive in batches from other servers, so
conversations might become unreadable because you first see all messages from
one user and then all messages from the other user.
This can be improved by adding more logic in the client (for example sort the
events by timestamp) and by improving the /messages api (maybe it can send prev
events to the client to improve ordering or call /context on them).
The text was updated successfully, but these errors were encountered: