Maintain history automatically #16

MatthewWid · 2021-06-26T08:02:02Z

Add a mechanism that automatically sends historical events to newly connected clients.

This is done by reading the client's last event ID and then sending every event logged after that ID, up to the most recent entry in the history log. If a client does not send a last event ID (and is thus treated as a brand new connection), there should be an option to send the entire history log.
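As a rough sketch of the replay step described above (not the library's implementation), the lookup could work as follows. The `HistoryEvent` shape and the `eventsSince` helper are hypothetical names used only for illustration, and falling back to the full log for an unknown ID is an assumption.

```typescript
// Hypothetical shape of a stored event; not part of any library API.
interface HistoryEvent {
	id: string;
	data: unknown;
}

// Return the events a (re)connecting client has missed.
// - No last event ID: treat as a brand new connection and return the whole log.
// - Known last event ID: return everything logged after that event.
// - Unknown last event ID: fall back to the whole log (an assumption; sending
//   nothing would be an equally valid policy).
function eventsSince(
	log: readonly HistoryEvent[],
	lastEventId?: string
): HistoryEvent[] {
	if (lastEventId === undefined || lastEventId === "") {
		return [...log];
	}

	const index = log.findIndex((event) => event.id === lastEventId);

	return index === -1 ? [...log] : log.slice(index + 1);
}

const log: HistoryEvent[] = [
	{id: "1", data: "first"},
	{id: "2", data: "second"},
	{id: "3", data: "third"},
];

// A client that last saw event "1" is sent events "2" and "3".
const missed = eventsSince(log, "1").map((event) => event.id);

// A client with no last event ID is sent the entire log.
const everything = eventsSince(log).map((event) => event.id);
```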

History should be maintained per channel (#5): each channel queues a list of its events and, when a session is registered, pushes them all at once as soon as the client has connected.

MatthewWid · 2021-06-26T08:29:22Z

History could alternatively be maintained in a "central" history log, and then sessions and channels subscribe to this single history log. This may be more intuitive as, if a user is not using channels, they may be confused as to why history is not automatically re-populated.

The current philosophy is that broadcasting to multiple people needs history, whereas sending to a single person does not and cannot have a central history.

Take, for example, the following situation when using a central log for all sessions:

User A is sent event 1.
User B is sent event 2.
User A disconnects.
User A is sent event 3.
User A reconnects, with its last event ID as event 1.

The server will recognize event 1 as the last event sent to user A and send events 2 and 3, even though event 2 was only meant for user B.

This is a security issue as users could alter their last event ID and receive "private" events sent only to other users.
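The scenario above can be reproduced with a small sketch. Everything here is hypothetical: `LoggedEvent`, `centralLog` and `naiveReplay` are illustrative names, and the `recipient` field is included only to make the leak visible.

```typescript
interface LoggedEvent {
	id: number;
	recipient: string; // Who the event was originally meant for.
	data: string;
}

// One log shared by every session, as in the scenario above.
const centralLog: LoggedEvent[] = [
	{id: 1, recipient: "A", data: "for A"},
	{id: 2, recipient: "B", data: "for B"},
	{id: 3, recipient: "A", data: "for A, missed while disconnected"},
];

// Naive replay: everything after the last event ID, regardless of recipient.
function naiveReplay(
	log: readonly LoggedEvent[],
	lastEventId: number
): LoggedEvent[] {
	return log.filter((event) => event.id > lastEventId);
}

// User A reconnects with a last event ID of 1.
const replayedToA = naiveReplay(centralLog, 1);

// Events in the replay that were never meant for user A.
const leaked = replayedToA.filter((event) => event.recipient !== "A");
```

Here `leaked` contains event 2, which was only ever sent to user B. A client that sets its last event ID to an earlier value would be replayed even more of the shared log.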

In addition, identifying which events were meant for whom is difficult. When multiple clients share one history log, and some clients need only some events while others need different ones, the server cannot tell them apart without some alternative identifier for each client, which is not possible with the native EventSource implementation.

As a workaround to this, the server can be configured to just create individual channels that the session is conditionally attached to, or otherwise only use channels and never send to a single client directly.

In the best case the developer gets lower-level control over individual open connections/sessions, and in the worst case it is the same as the current most popular server-side SSE library that forces the developer to always use a broadcast channel, but with the bonus that a session can be subscribed to multiple channels at once whilst still maintaining the full history of broadcasts to each channel.

MatthewWid · 2021-10-30T11:50:49Z

Minimal history API:

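import express from "express";
// createHistory is the API proposed in this issue.
import {createHistory, createSession} from "better-sse";
import {channelA, channelB} from "./channels";

const app = express();

// Track every broadcast made on both channels in one history log.
const history = createHistory()
	.track(channelA)
	.track(channelB);

app.get("/sse", async (req, res) => {
	// createSession resolves once the connection has been opened.
	const session = await createSession(req, res);

	channelA.register(session);
	channelB.register(session);

	// Replay the events the client missed since its last event ID.
	history.sendSinceLastId(session);
});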

The above registers the session with channelA and channelB and, upon connection, sends the session every event broadcast on channelA and channelB since its last event ID.

Unlike other SSE libraries, this design allows a single session to be registered with multiple channels whilst still maintaining a history log for all of them. It also still accounts for channels that are conditionally registered, such as when you only want to send certain events to authorized vs. non-authorized users.

Q: Why must channels be explicitly defined when creating a history log?

A: If the history log does not know all channels ahead of time it would only be able to track broadcasted events that actually reached a client - it would detect channels registered with the session and then detect broadcasts on those channels. The issue, however, is that when a channel has no registered sessions, or is only conditionally attached to sessions, its broadcasted events would never be added to the history log and would be lost.

Q: How does a history log know to associate a given last event ID with the history of multiple channels?

A: Each channel will generate a custom event ID, instead of the Session push method doing so itself. The history log will then listen for the broadcast and grab that event ID and append it to a single unified linear history log so that it can send back all events from all channels in chronological order.

Q: What about conditionally registered channels?

A: The history log will only send back events of the channels the session is registered with.
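The three answers above could fit together roughly as in the following sketch. `SketchChannel` and `SketchHistory` are hypothetical stand-ins, not the better-sse classes, and the shared ID counter is just one assumed way to keep channel-generated IDs unique across channels.

```typescript
type BroadcastListener = (id: string, data: string) => void;

// Hypothetical stand-in for a channel; not the better-sse Channel class.
class SketchChannel {
	// Shared counter so that IDs are unique across all channels (an assumption).
	private static nextId = 0;
	private listeners: BroadcastListener[] = [];

	onBroadcast(listener: BroadcastListener): void {
		this.listeners.push(listener);
	}

	// The channel, not the session, generates the event ID.
	broadcast(data: string): string {
		const id = String(++SketchChannel.nextId);
		for (const listener of this.listeners) listener(id, data);
		return id;
	}
}

interface Entry {
	id: string;
	channel: SketchChannel;
	data: string;
}

class SketchHistory {
	// A single linear log shared by all tracked channels, in broadcast order.
	private entries: Entry[] = [];

	// Channels are declared up front so broadcasts are logged even when no
	// session is registered with the channel yet.
	track(channel: SketchChannel): this {
		channel.onBroadcast((id, data) => this.entries.push({id, channel, data}));
		return this;
	}

	// Events after lastId, restricted to channels the session is registered
	// with. A missing or unknown lastId replays from the start of the log.
	since(
		lastId: string | undefined,
		registered: ReadonlySet<SketchChannel>
	): Entry[] {
		const index =
			lastId === undefined
				? -1
				: this.entries.findIndex((entry) => entry.id === lastId);

		return this.entries
			.slice(index + 1)
			.filter((entry) => registered.has(entry.channel));
	}
}

const sketchA = new SketchChannel();
const sketchB = new SketchChannel();
const sketchHistory = new SketchHistory().track(sketchA).track(sketchB);

const firstId = sketchA.broadcast("a1");
sketchB.broadcast("b1");
sketchA.broadcast("a2");
sketchB.broadcast("b2");

// A session registered only with channel A that last saw "a1".
const onlyA = sketchHistory
	.since(firstId, new Set([sketchA]))
	.map((entry) => entry.data);

// A session registered with both channels that last saw "a1".
const both = sketchHistory
	.since(firstId, new Set([sketchA, sketchB]))
	.map((entry) => entry.data);
```

With these broadcasts, `onlyA` holds just "a2", while `both` holds "b1", "a2" and "b2" in the order they were broadcast.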

Q: Will we be able to edit history?

A: For the minimal API the history log will simply track events on given channels and then send all events from registered channels back to the session in chronological order. In the future, however, you will be able to do things such as retrieve the list of events, edit and remove events, set a history size, filter events added to the history, etc.

The history API will also be integrated into the upcoming Redis module, where not only will broadcasted events be published to the Redis instance, but the history log will also be stored and persisted too.

MatthewWid · 2022-01-29T06:55:27Z

Adding a note about limitations of the initial implementation: we currently use a Map from event ID to Event that allows for fast iteration, lookup and removal as well as naturally maintaining insertion order.

An issue I have found when implementing event modification, however, is that you are unable to change the ID of an existing event as it breaks the ordering of events. As Maps maintain insertion order, changing the ID of the event requires removing the old event ID key from the map and re-inserting the event under a new key. Naturally, this changes the insertion order to place the updated event at the end of the list, making it appear as the latest event in the history.
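The ordering problem can be seen with a plain Map. This is only a minimal illustration with made-up event IDs and string values standing in for events.

```typescript
const events = new Map<string, string>([
	["1", "first"],
	["2", "second"],
	["3", "third"],
]);

// "Rename" event "2" to "2b" by deleting the old key and inserting a new one.
const value = events.get("2")!;
events.delete("2");
events.set("2b", value);

// The renamed event is now last in iteration order: ["1", "3", "2b"]
const order = [...events.keys()];
```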

Some solutions to this could include:

Limit event updating to only allow the user to set the event data and event name - it is discouraged to set a custom ID for events, anyway.
Add an additional map from an internally generated auxiliary event ID to the canonical event IDs and then only update and iterate the values in the map but not the keys - this maintains insertion order but is more convoluted.
Manually tear down and re-insert all events after the updated event, placing the updated event in the correct location - this will likely have very poor performance, comparatively.
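To give a feel for the second option, here is a hedged sketch of an auxiliary-ID layout. `OrderedHistory`, `StoredEvent` and all method names are hypothetical and are not taken from the actual implementation.

```typescript
interface StoredEvent {
	id: string;
	data: string;
}

class OrderedHistory {
	private nextInternalId = 0;

	// Internal ID -> event. Keys never change, so insertion order stays stable.
	private byInternalId = new Map<number, StoredEvent>();

	// Canonical event ID -> internal ID, for lookup by event ID.
	private internalIdByEventId = new Map<string, number>();

	add(event: StoredEvent): void {
		const internalId = this.nextInternalId++;
		this.byInternalId.set(internalId, {...event});
		this.internalIdByEventId.set(event.id, internalId);
	}

	// Change an event's canonical ID without moving it in the history.
	changeId(oldId: string, newId: string): boolean {
		const internalId = this.internalIdByEventId.get(oldId);

		if (internalId === undefined || this.internalIdByEventId.has(newId)) {
			return false;
		}

		// Update the value in place; the key in byInternalId is untouched.
		const stored = this.byInternalId.get(internalId)!;
		stored.id = newId;

		this.internalIdByEventId.delete(oldId);
		this.internalIdByEventId.set(newId, internalId);

		return true;
	}

	// Canonical event IDs in their original insertion order.
	ids(): string[] {
		return [...this.byInternalId.values()].map((event) => event.id);
	}
}

const ordered = new OrderedHistory();
ordered.add({id: "1", data: "first"});
ordered.add({id: "2", data: "second"});
ordered.add({id: "3", data: "third"});

ordered.changeId("2", "2b");

// Order is preserved: ["1", "2b", "3"]
const idsAfterChange = ordered.ids();
```

The cost is a second map to keep in sync on every insert, update and removal, which is the extra convolution mentioned above.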

MatthewWid added the enhancement New feature or request label Jun 26, 2021

MatthewWid mentioned this issue May 30, 2022

Event batching #42

Closed
