Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for external signaling federation #12604

Merged
merged 17 commits into from
Jul 24, 2024

Conversation

danxuliu
Copy link
Member

@danxuliu danxuliu commented Jun 28, 2024

For external signaling federation the federated signaling server joins the host room in the host Nextcloud server, and establishes a connection against the host signaling server, proxying the messags from the host signaling server to the federated clients connected to it; it does not join the room in the federated Nextcloud server (from the point of view of the federated signaling server; the client does join the conversation in the federated Nextcloud server and the federated Nextcloud server proxies the join to the host Nextcloud server), although the federated Nextcloud server will anyway send some notifications to it (as it is the configured signaling server).

Unfortunately, as the session id of the participant in the federated server is used to join the conversation in the remote server, federation no longer works with local federated participants (as the session id would be duplicated).

Requires strukturag/nextcloud-spreed-signaling#776

Client changes

Note: this section is subject to change before the pull request is merged.

Clients for federated users send requests to the federated Nextcloud server, and the federated Nextcloud server proxies them as needed to the remote Nextcloud server; the clients do not directly connect to the remote Nextcloud server. In the case of the signaling server the same approach is used: the clients establish a connection to the federated signaling server, and the signaling server in turn establishes a connection with the remote signaling server.

Note, however, that the federated signaling server authenticates against the remote Nextcloud server, not the federated one, and it directly joins the remote room, not the federated one.

Besides adjusting the UI to the new features (for example, federated users now have a session in the remote server, so they can be shown as active or inactive in the conversation) the clients need to:

  • Provide some additional parameters to the external signaling server when joining a room in a federated conversation
  • Convert the roomid parameter from the remote conversation token to the federated conversation token when receiving messages in a federated conversation Not needed if both roomid and federation.roomid parameters are provided to the signaling server

Joining a room

Note that the signaling settings, for legacy reasons, can be got without specifiying the token of the conversation. This is not supported for federated conversations; when getting the signaling settings of a federated conversation the token of the federated conversation must be provided.

When the token of a federated conversation is provided the returned settings will contain the additional parameters that need to be passed to the signaling server when joining the federated room. Those parameters will be under the federation key. If no token is provided, or if the token does not belong to a federated conversation, the federation key will be still included in the settings, although with an empty value (this could change to not provide a federation key at all, as mentioned in the TODO section below).

The signaling settings must be refreshed before joining a federated conversation, as the authentication token is valid only for one minute. This also implies that, unless the connection to the local signaling server is kept open, the signaling settings will need to be fetched again when switching to a non-federated conversation and thus connecting again to the local signaling server, as any authentication token previously got for the local signaling server would be probably expired (except on very quick changes between conversations).

When joining a normal room the clients send a room message to the signaling server with the following format:

type: 'room',
room: {
    roomid: the-token-of-the-room,
    sessionid: the-nextcloud-session-id,
},

When joining a federated room the clients should send a room message with the following format instead:

type: 'room',
room: {
    roomid: conversation-token,
    sessionid: the-nextcloud-session-id,
    federation: {
        signaling: settings.federation.server,
        url: settings.federation.nextcloudServer + '/ocs/v2.php/apps/spreed/api/v3/signaling/backend',
        roomid: settings.federation.roomId,
        token: settings.federation.helloAuthParams.token,
},
  • The roomid is the token of the federated conversation (that is, the token of the conversation that the participant is directly joining, so probably nothing needs to be changed in the code)
  • The url is similar to the URL sent in hello messages, but in this case for the remote Nextcloud server (as the federated signaling server will authenticate against the remote Nextcloud server, not the federated one)
  • The federation.roomid in this case is the token of the remote room, not the federated room, so its value is provided in the settings. Also note the different casing in roomid in message.room.federation.roomid and in settings.federation.roomId

room and roomlist signaling messages

Note: after a quick look to the code in the Talk Android and Talk iOS apps this does not seem to be currently relevant. For the WebUI it works as expected without changes.

TL;DR;: for federated users ignore the properties in room signaling messages and use the ones from roomlist events instead. roomlist events with property participant-list: "refresh" are also sent if local users were added to or removed from the remote room.

When certain properties of a room are modified and a participant is active in the room (has joined it) that participant receives a room message from the signaling server with the updated properties:

{
    type: 'room',
    room: {
        properties: {
            active-since: ...,
            ...
        },
        roomid: THE-CONVERSATION-TOKEN,
    },
}

This is true too for federated participants when the remote room is modified.

However, federated participants get their room data from the proxy/federated room, not from the remote one. Currently only a subset of the remote room properties is proxied to the federated rooms, and some of the properties notified by the signaling server are not proxied to the federated rooms (for example, whether the conversation is listable or not, as that is not used in federated rooms).

Due to all that a federated user can receive a room message with updated properties that will never be reflected in the room data. Therefore the properties in room signaling messages should not be used to update the data of a federated room in clients.

The room message is sent by the signaling server to all active participants when it is notified by the Nextcloud server that the room was modified. However, a roomlist message is also sent to all the users (not participants, users) in the conversation (as instructed by the Nextcloud server in the notification, no matter if they are active or not:

{
    type: 'event',
    event: {
        target: 'roomlist',
        type: 'update',
        update: {
            properties: {
                active-since: ...,
                ...
            },
            roomid: THE-CONVERSATION-TOKEN,
        },
    },
}

As the participants that are notified are only local users the federated users do not receive a roomlist event from the remote Nextcloud server. However, as soon as the properties are propagated to the federated Nextcloud server the federated Nexcloud server will notify its own signaling server, so the federated signaling server will in turn notify the federated users in the federated/proxy room (as, for that room, they are the local users). In this case the roomlist event received by the federated users will always match the federated room properties, so it can be used to update the data of the federated room in clients.

roomlist events with updated properties are also sent when a user is invited or disinvited to the conversation. In this case invited/disvinted user receives a roomlist event with type invite or disinvite, while the rest of users receive a roomlist event with type update, and the properties include participant-list: "refresh"

Now federated users will receive the roomlist update as well when another user is invited/disinvited to the conversation (or a session is removed). Note that the federated users will not receive roomlist events with type invite or disinvite, as the process to add them to the conversation is different than for local users. Another detail is that, in this case, the roomlist update is sent by the remote Nextcloud server (as the participants are modified in the remote Nextcloud server, not the federated one), but this is just an implementation detail that clients should not be concerned about.

Converting roomid parameters

This would be needed only if roomid in the previous parameters is set to the token of the remote room and federation.roomid is not given. So... just set them and let the signaling server do the conversion :-) (but please notify it if some roomid is not converted)

Given that the federated signaling server joins the room in the remote Nextcloud server the `roomid` parameter in the proxied signaling messages will be the token of the remote room, not the federated one. Due to that the clients need to translate the received `roomid` parameter from the remote room to the federated room.

The roomid parameter appears in these messages:

type: 'room',
room: {
    roomid: 'REMOTE-TOKEN',
    ...
}

and:

type: 'event',
event: {
    type: 'message|update|disintive|switchto',
    message|update|disintive|switchto: {
        roomid: 'REMOTE-TOKEN',
        ...
    }
}

For reference this is how it was implemented in the WebUI in signaling.js:

		if (this.settings.federation?.roomId) {
			const remoteRoomId = this.settings.federation.roomId

			let roomIdConverted = false

			if (data.room?.roomid === remoteRoomId) {
				data.room.roomid = this.signalingRoomJoined
				roomIdConverted = true
			}
			if (data.event?.message?.roomid === remoteRoomId) {
				data.event.message.roomid = this.signalingRoomJoined
				roomIdConverted = true
			}
			if (data.event?.update?.roomid === remoteRoomId) {
				data.event.update.roomid = this.signalingRoomJoined
				roomIdConverted = true
			}
			if (data.event?.disinvite?.roomid === remoteRoomId) {
				data.event.disinvite.roomid = this.signalingRoomJoined
				roomIdConverted = true
			}
			if (data.event?.switchto?.roomid === remoteRoomId) {
				data.event.switchto.roomid = this.signalingRoomJoined
				roomIdConverted = true
			}

			if (OC.debug && roomIdConverted) {
				console.debug('Received, roomid converted', data)
			}
		}

Notes

TODO (could be moved to follow ups as needed)

  • Does any API version need to be bumped?
  • Reset throttling when joining a remote room, just like done when joining a room?
  • If a room is not a federated room the signaling settings still include the federation key with an empty value for consistency with other properties like stunServers. But would it be better to fully remove the key in that case?
    • If the empty key is kept it could be used to differentiate between Talk 19 and Talk 20, although not sure if that could be useful
  • Handle incompatible server versions - Federated servers compatibility between Talk 19 and Talk 20 #12805
    • A federated Talk 19 should be able to join against a Talk 20 host, although it would not be shown as active.
    • A federated Talk 20 might be able to join against a Talk 19 host, although it should be tested.
    • Getting the settings by a federated Talk 20 from a Talk 19 host would fail, and the request would be throttled. Maybe a new endpoint like settings/federation should be added to get the settings from the remote server, so in case a bad request is returned the federated server can know that the remote server is not Talk >= 20 and return federation settings?
  • Handle incompatible signaling server versions - Handle incompatible signaling servers between federated servers #12806
    • Between a federated Talk 20 and a remote Talk 20 the federation settings could be empty if both signaling modes are different, so clients would behave just like in Talk 19 (they will connect to the federated signaling server, and it would connect to the federated Nextcloud server, ignoring the remote one).
  • Add capability
    • Should an additional configuration capability to know if both the remote and the federated servers are using the same signaling server mode (external or internal, as they are not compatible)? Or just rely on the signaling settings as described above?
  • Handle errors in proxied requests (also related to incompatible server versions)
  • Throttled requests due to bruteforce protection could make the proxied request to timeout, is it possible to show an explicit error in that case?
  • Adjust to possible changes in https://github.com/strukturag/nextcloud-spreed-signaling/tree/federation
  • Cleanup
  • Make CI happy
  • Add tests
  • Documentation

Done

  • The to field in messages of type message (for example, startedTyping) is the session ID of the internal connection made from the federated signaling to the host signaling. However, this field is part of a free format, it is not explicitly known to the signaling server, which just relays it, so in this case it will probably need to be handled by the clients. Also note that (at least in the web) the to field is used to set the recipient of the message so it would not be enough to simply change the to field to the native signaling session in the federated signaling server (which, on the other hand, is not currently known to the rest of participants anyway)
    • In the end this is done by the signaling server. Even if for some messages it is a free format field the signaling server is aware of the to and from fields for offer, answer and candidate messages, and the rest of messages use the same fields with the same semantic, so it can convert them as well. Moreover, having to do the conversion is an internal detail due to how the federation is currently implemented in the signaling server, but with some changes it may not be even necessary.
  • Fix joining a federated conversation again - Right now an Integrity constraint violation: 1062 Duplicate entry 'XXX...' for key 'ts_session'" exception happens
  • Ensure that federated session IDs are never longer than the column length in the database
  • Leaving a room as a federated user does not cause a { target: "room", type: "leave" } signaling message to be sent to other participants. This will probably need to be fixed in the signaling server to leave the room when a federated client closes the connection, in a similar way to how the room is currently joined when establishing the federated connection (but this is just my guess, @fancycode will know the right way :-) ).
  • Adjust length of sender and recipient columns in talk_internalsignaling table to match the maximum session id length (512); otherwise when the cloud id is appended to the session id it overflows the column length and federation fails with an internal signaling server (including in tests).
  • roomid in proxy signaling messages received by the federated clients (for example, {"target":"participants", "type":"update"}) include the token of the conversation in the host. Should the proxy message include the token of the proxied conversation in the federated server instead? Or should the federated clients do the conversion themselves?
    • The signaling server now does it :-)
  • Events like changes in the room name or description are notified from the host server to the federated server; this modifies the proxy conversation in the federated server, which in turn causes a message to be sent by the federated Nextcloud server to the federated signaling server (just like in a normal server), so the federated participant receives a signaling message. However, the events that are notified from the host Nextcloud server to the federated Nextcloud server are a subset of the events that cause a message to be sent to the signaling server, so the federated participant does not receive all the signaling events when a room is modified and that participant is not active in the room (if the participant is in the room it will receive room messages from the host server relayed by the signaling server and roomlist messages from the federated Nextcloud server, if not active in the room only roomlist messages from the federated Nextcloud server, see below for details on roomlist messages from the host Nextcloud server relayed by the signaling server).
    • room messages are not sent by the federated signaling server to the clients when the federated Nextcloud server notifies the federated signaling server because, from the point of view of the federated signaling server, the clients are not in the federated room, they are in the host room
    • Documented in Client changes above
  • roomlist signaling messages are currently not proxied to the federated participants. Besides being used to notify clients when the Nextcloud server notifies a room update (like described above) they are also used (with type: "update") to notify other clients when another participant is invited or disinvited to the room, so federated participants do not receive those events. Maybe it will be enough to add the federated user id to the alluserids property in invite and disinvite messages sent from the Nextcloud server to the signaling server, maybe something else will be needed :-)
    • In the case of room updates the property would be userids, but it might not be needed if all the events are proxied from the host Nextcloud server to the federated Nextcloud server
    • invite/disinvite events are also notified to federated users; remote room modifications are not, as they may include data not propagated to the federated server, and it is also enough if the federated server does the notification

Follow-ups

How to test

  • Setup Nextcloud server A
  • Setup signaling server (in the federation branch) for server A
    • You may need to set federation->skipverify = false in the signaling server settings
  • Setup Nextcloud server B
  • Setup signaling server (in the federation branch) for server B
    • You may need to set federation->skipverify = false in the signaling server settings
  • In both Nextcloud servers, enable federation in Talk settings
  • "/ocm-provider" should be available in both severs; depending on your setup you may need to enable mod_rewrite and mod_env in Apache (a2enmod rewrite and a2enmod env) and set AllowOverride All for the Nextcloud directory in the main site configuration of both servers
  • If using self-signed certificates, you will need to allow them in both servers with occ config:system:set sharing.federation.allowSelfSignedCertificates --value true --type bool
  • Create a conversation in Nextcloud server A
  • Add a user from Nextcloud server B
  • In a private window, log in Nextcloud server B
  • Accept the invitation and join the conversation
  • Type a message; XXX is typing should be shown in the other window

@danxuliu danxuliu added this to the 💙 Next Major (30) milestone Jun 28, 2024
@danxuliu danxuliu mentioned this pull request Jul 10, 2024
14 tasks
@danxuliu danxuliu force-pushed the add-support-for-external-signaling-federation branch from e9ff47a to a86a894 Compare July 17, 2024 11:49
@danxuliu

This comment was marked as resolved.

@danxuliu danxuliu force-pushed the add-support-for-external-signaling-federation branch from a86a894 to 96f0959 Compare July 18, 2024 03:49
@fancycode
Copy link
Member

There is a problem with missing signaling events for federated participants when other participants join a conversation, but this looks like something that needs to be addressed in the signaling server itself (I guess because SetRoom, which clears the list of seen join events, is not called for federated clients) @fancycode

Fixed in strukturag/nextcloud-spreed-signaling@c8518de

Problem was happening if a session left a federated room and the joined again.

@fancycode
Copy link
Member

How will switchto events work in federated rooms if the target also is a federated room? The event contains the room id to switch to but from what I understand the room ids are different for federated clients.

@fancycode
Copy link
Member

The signaling server periodically sends ping requests to Nextcloud for active sessions:

private function backendPing(array $request): DataResponse {

Where should these requests should be sent to for federated sessions? The local Nextcloud or the remote/federated one?

@danxuliu
Copy link
Member Author

There is a problem with missing signaling events for federated participants when other participants join a conversation, but this looks like something that needs to be addressed in the signaling server itself (I guess because SetRoom, which clears the list of seen join events, is not called for federated clients) @fancycode

Fixed in strukturag/nextcloud-spreed-signaling@c8518de

Problem was happening if a session left a federated room and the joined again.

Tested and works 👍 Thanks!

@danxuliu
Copy link
Member Author

How will switchto events work in federated rooms if the target also is a federated room? The event contains the room id to switch to but from what I understand the room ids are different for federated clients.

Yes, the room id would be different for federated users. However, if I am not mistaken the mapping between the original room id and the federated room id is known only to the federated server, so the host would not be able to generate a switchto event with the federated room id. Even if now the signaling server performs the conversion between room id for received messages (thanks for that!) that would not work either in this case, as the conversion requires the client to have provided both room ids, so the client would need to provide the mapping of the child breakout room when joining the parent room (which I am not a fan of).

Therefore, from my point of view this could be a special case in which the clients would need to do the conversion themselves. In any case I would leave this for the future, as right now breakout rooms are not available for federated participants (breakout rooms can be configured in a conversation with federated participants, but they are not distributed to the child rooms).

@fancycode
Copy link
Member

Another question: what should happen if the connection between the signaling servers is interrupted?

  • Disconnect the federated user, so they have to reconnect / re-join (which then might fail).
  • Keep the connection to the user open (send a "link down" event) and try to reconnect / resume internally. If so, how long and what should happen if the reconnect fails.
  • Force the user to leave the room.

@danxuliu
Copy link
Member Author

The signaling server periodically sends ping requests to Nextcloud for active sessions:

private function backendPing(array $request): DataResponse {

Where should these requests should be sent to for federated sessions? The local Nextcloud or the remote/federated one?

The ping should be definitely sent from the federated signaling server to the remote Nextcloud server, as the last ping is used for things like knowing if there is an active call and generate the appropriate system message (participant started a call or participant joined a call).

Regarding pinging the federated Nextcloud server... I would also ping it if possible. Right now it does not seem to be strictly needed, as the places where the last ping value is used are currently relevant in the host/remote server. But the last ping of the current participant is anyway exposed in the lastPing field of the room data; I do not know if clients use it, but just in case :-)

@danxuliu
Copy link
Member Author

Another question: what should happen if the connection between the signaling servers is interrupted?

* Disconnect the federated user, so they have to reconnect / re-join (which then might fail).

* Keep the connection to the user open (send a "link down" event) and try to reconnect / resume internally. If so, how long and what should happen if the reconnect fails.

* Force the user to leave the room.

For simplicity I would simply disconnect the federated user and maybe add something more elaborate if needed at a later point.

I would not force the user to leave the room, though, as if the federated user is removed without the remote server being aware then the federated user would not be able to join back until invited again, but as she will still be part of the conversation from the point of view of the other participants that invitation may not come :-)

@nickvergessen
Copy link
Member

Regarding pinging the federated Nextcloud server... I would also ping it if possible. Right now it does not seem to be strictly needed,

It is, otherwise you will be spamed with notifications even though being online

@fancycode
Copy link
Member

One possible optimization:

  • When switching rooms, clients should not leave the room before joining the new one. Currently clients send a room request with an empty room id (which leaves the current room) and then another room request with the new federated room. Switching directly would allow to reuse the federated connection.

@danxuliu danxuliu force-pushed the add-support-for-external-signaling-federation branch 2 times, most recently from 20b8f9d to 1857b84 Compare July 22, 2024 14:46
@nickvergessen nickvergessen added client: 🤖🍏 mobile feature: call 📹 Voice and video calls feature: federation 🌐 feature: api 🛠️ OCS API for conversations, chats and participants labels Jul 23, 2024
This will make possible to join a room in a remote Nextcloud server with
the same session used in the local one avoiding the need to keep a map
of sessions to convert between them and ensuring that a duplicated
session id will not be used.

The column length for the session id is 512, while generated session ids
are only 255 characters long, so in most cases the cloud id can be added
as is. Only if the cloud id is longer than 256 characters (one character
needs to be reserved for the separator character) it will need to be
trimmed, but that is unlikely to happen; user ids are at most 64
characters, so the "@" plus the domain would need to be longer than 192
characters. Therefore any cloud id longer than 256 characters is just
trimmed at the end as a safety measure, but a fancier algorithm, for
example to ellipsize it, is not needed.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Federated request handling in "join/leaveRoom" is a leftover from the
past, when federation requests were still not proxied by the federated
server (and federation itself was not fully implemented). Currently
federated requests are sent only to "joinFederatedRoom", so the related
code in "join/leaveRoom" can be removed.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
…sion

Unfortunately this prevents to join a room as a local federated user, as
in that case the session will be already in use.

On the other hand, this avoids the need to convert sessions between
host and federated servers, and with that also avoids having to keep a
map between sessions.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
With them the signaling server establishes a connection with the remote
signaling server and proxies its messages, also converting the room and
session ids from the remote server to the federated server as needed.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The signaling settings are got before joining a room, but they were
simply ignored unless the conversation cluster mode was used (as in that
case the signaling object was destroyed and the settings were applied to
the new signaling object when chaning between different signaling
servers). However, with federation the settings must be refreshed when
changing between federated rooms, or between federated and non federated
rooms, so for simplicity the settings are now always set after fetching
them.

Note that, in any case, getting the settings before joining any room
might be unnecessary and it may need to be refined, but for now it is a
convenient way to refresh the setting when changing between federated
and non federated rooms.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Before there was no session for federated participants, so it was not
possible to know if a federated participant was active or not in the
conversation. Now that federated participants join and leave the room
with a session it is possible to show them as active or not in the room.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The cloud id will be used as the user id by the external signaling
server, which will make possible to notify federated users from the
remote Nextcloud server.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
When a participant is invited or disinvited to a conversation, or a
session is removed, the affected user receives a "roomlist" message with
type "invite" or "disinvite", while the rest of users in the
conversation receive a "roomlist" message with type "update" and a
"participant-list: refresh" property. Now both federated users and local
users receive the "roomlist" update.

Local users are also notified with a "roomlist" update when the
properties of the room change. However, in that case the signaling
server of federated users will be notified by the federated Nextcloud
server when the property changes are propagated to it, so there is no
need to notify federated users from the remote Nextcloud server in that
case.

Note, however, that independently of the users explicitly notified with
the "userids" parameter (which can include inactive participants) that
receive the "roomlist" message, the signaling server automatically
notifies all active participants in a conversation when it is modified,
so active federated users will receive a "room" message with the updated
properties (which may never be reflected in the proxy conversation, as
some properties are not propagated to the federated conversations).

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
@nickvergessen nickvergessen force-pushed the add-support-for-external-signaling-federation branch from 6da21ea to d891573 Compare July 24, 2024 13:54
Copy link
Member

@nickvergessen nickvergessen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After way too long time I seem to have got this working locally.
Was able to see typing indicators

@danxuliu
Copy link
Member Author

Another question: what should happen if the connection between the signaling servers is interrupted?

* Disconnect the federated user, so they have to reconnect / re-join (which then might fail).

* Keep the connection to the user open (send a "link down" event) and try to reconnect / resume internally. If so, how long and what should happen if the reconnect fails.

* Force the user to leave the room.

For simplicity I would simply disconnect the federated user and maybe add something more elaborate if needed at a later point.
I would not force the user to leave the room, though, as if the federated user is removed without the remote server being aware then the federated user would not be able to join back until invited again, but as she will still be part of the conversation from the point of view of the other participants that invitation may not come :-)

Reconnect has been implemented in strukturag/nextcloud-spreed-signaling@880e22b

Check the API documentation for behaviour and additional events that are sent out to the client when the reconnect happens.

Thanks! I opened an issue for this in #12808

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants