Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2676: Message editing #2676

Merged
merged 28 commits into from
Jul 18, 2022
Merged
Changes from 10 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
8b95ffc
initial version of message editing proposal
uhoreg Jul 7, 2020
7f42c8b
fix MSC numbers
uhoreg Jul 7, 2020
5473009
Fix JSON in example
uhoreg Nov 24, 2020
e44f566
clarifications
uhoreg Jun 2, 2021
634dc2e
remove obsolete "XXX:", and fix a typo
uhoreg Oct 28, 2021
f9768e7
Initial cleanup and restructuring
richvdh Apr 12, 2022
1de6c11
Clarify algorithm for replacing content
richvdh Apr 12, 2022
adcdddc
background
richvdh Apr 12, 2022
fc2a690
More clarifications on applying edits
richvdh Apr 12, 2022
80c467b
Clarify behaviour of redactions
richvdh Apr 12, 2022
6cea76a
Minor grammar fixes
richvdh May 17, 2022
dac2399
Move the section on `msgtype` down
richvdh May 17, 2022
b8a7745
Clarify how edits are ordered
richvdh May 17, 2022
79a362e
Spec the behaviour for encrypted events
richvdh May 17, 2022
bb96694
Requirements for an edit event to be considered valid
richvdh May 17, 2022
dd1ca71
Collect "client behaviour" and "sever behaviour" together
richvdh May 17, 2022
eba4753
Clarify permalinks section
richvdh May 17, 2022
78550a2
Notes on edits of replies
richvdh May 17, 2022
e58715c
Clarify that `m.relates_to` within `m.new_content` is ignored
richvdh May 19, 2022
4c77c01
Clarifications from review
richvdh Jun 28, 2022
1850fb5
event ids are sorted lexicographically
richvdh Jun 28, 2022
c9e6652
Clarify aggregation section
richvdh Jun 28, 2022
1e63094
minor clarifications
richvdh Jul 12, 2022
2373873
Clarify which endpoints support edits
richvdh Jul 12, 2022
8ec4cf3
move definition of latest edit
richvdh Jul 12, 2022
08489e8
Apply suggestions from code review
richvdh Jul 13, 2022
f3d328d
fix typo
richvdh Jul 13, 2022
b245761
Attempt to clarify encrypted events
richvdh Jul 13, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
294 changes: 294 additions & 0 deletions proposals/2676-message-editing.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,294 @@
# MSC2676: Message editing
richvdh marked this conversation as resolved.
Show resolved Hide resolved
richvdh marked this conversation as resolved.
Show resolved Hide resolved
turt2live marked this conversation as resolved.
Show resolved Hide resolved

Users may wish to edit previously sent messages, for example to correct typos.
This can be done by sending a new message with an indication that it replaces
the previously sent message.

This proposal is one in a series of proposals that defines a mechanism for
events to relate to each other. Together, these proposals replace
[MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849).

* [MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674) defines a
standard shape for indicating events which relate to other events.
* [MSC2675](https://github.com/matrix-org/matrix-doc/pull/2675) defines APIs to
let the server calculate the aggregations on behalf of the client, and so
bundle the related events with the original event where appropriate.
* This proposal defines how users can edit messages using this mechanism.
* [MSC2677](https://github.com/matrix-org/matrix-doc/pull/2677) defines how
users can annotate events, such as reacting to events with emoji, using this
mechanism.

## Background

Element-Web (then Riot-Web) and Synapse both implemented initial support for
message editing, following the proposals of MSC1849, in May 2019
([matrix-react-sdk](https://github.com/matrix-org/matrix-react-sdk/pull/2952),
[synapse](https://github.com/matrix-org/synapse/pull/5209)). Element-Android
and Element-iOS also added implementations around that time. Unfortunately,
those implementations presented the feature as "production-ready", despite it
not yet having been adopted into the Matrix specification.

The current situation is therefore that client or server implementations hoping
to interact with Element users must simply follow the examples of that
implementation. In other words, message edits form part of the *de-facto* spec
despite not being formalised in the written spec. This is clearly a regrettable
situation. Hopefully, processes have improved over the last three years so that
this situation will not arise again. Nevertheless there is little we can do
now other than formalise the status quo.

This MSC, along with the others mentioned above, therefore seeks primarily to
do that. Although there is plenty of scope for improvement, we consider that
better done in *future* MSCs, based on a shared understaning of the *current*
implementation.

In short, this MSC prefers fidelity to the current implementations over
elegance of design.

## Proposal
richvdh marked this conversation as resolved.
Show resolved Hide resolved

### `m.replace` event relationship type

A new `rel_type` of `m.replace` is defined for use with the `m.relates_to`
field as defined in
[MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674). This is
intended primarily for handling edits, and lets you define an event which
replaces an existing event.

richvdh marked this conversation as resolved.
Show resolved Hide resolved
The replacement event must contain a `m.new_content` property which defines the
replacement content. (This allows the normal `body` fields to be used for a
fallback for clients who do not understand replacement events.)

For instance, an `m.room.message` which replaces an existing event might look like:

```json
{
"type": "m.room.message",
"content": {
"body": "* Hello! My name is bar",
"msgtype": "m.text",
"m.new_content": {
"body": "Hello! My name is bar",
uhoreg marked this conversation as resolved.
Show resolved Hide resolved
"msgtype": "m.text"
uhoreg marked this conversation as resolved.
Show resolved Hide resolved
},
"m.relates_to": {
"rel_type": "m.replace",
"event_id": "$some_event_id"
}
}
}
```

Such an event, with `rel_type: m.replace`, is referred to as a "message edit".

The `m.new_content` can include any properties that would normally be found in
an event's `content` property, such as `formatted_body`.

Note that the `msgtype` property of `m.room.message` events need not be the
same as in the original event. For example, if a user intended to send a
message beginning with "/me", but their client sends an `m.emote` event
instead, they could edit the message to send be an `m.text` event as they had
originally intended.

Whenever a homeserver would return an event via the Client-Server API, it
should check for any applicable `m.replace` event, and if one is found, it
should first modify the `content` of the original event according to the
`m.new_content` of the most recent edit (as determined by
`origin_server_ts`). An exception applies to [`GET
/_matrix/client/v3/rooms/{roomId}/event/{eventId}`](https://spec.matrix.org/v1.2/client-server-api/#get_matrixclientv3roomsroomideventeventid),
which should return the *unmodified* event (though the relationship should still be
"bundled", as described [below](#server-side-aggregation-of-mreplace-relationships).

Clients are generally expected to ignore message edit events, since the server
implementation takes care of updating the content of the original
event. However, if the client has already received the original event, it must
apply the replacement to the original itself (or, alternatively, request an
updated copy of the original via [`GET
/_matrix/client/v3/rooms/{roomId}/context/{eventId}`](https://spec.matrix.org/v1.2/client-server-api/#get_matrixclientv3roomsroomidcontexteventid)
or similar).

#### Applying `m.new_content`

When applying a replacement, the `content` property of the origial event is
replaced entirely by the `m.new_content`, with the exception of `m.relates_to`,
which is left *unchanged*. For example, given a pair of events:
richvdh marked this conversation as resolved.
Show resolved Hide resolved

```json
{
"event_id": "$original_event",
"type": "m.room.message",
"content": {
"body": "I *really* like cake",
"msgtype": "m.text",
"formatted_body": "I <em>really</em> like cake",
}
}
```

```json
{
"event_id": "$edit_event",
"type": "m.room.message",
"content": {
"body": "* I *really* like *chocolate* cake",
"msgtype": "m.text",
"m.new_content": {
"body": "I *really* like *chocolate* cake",
"msgtype": "m.text",
"com.example.extension_property": "chocolate"
},
"m.relates_to": {
"rel_type": "m.replace",
"event_id": "$original_event_id"
}
}
}
```

... then the end result is an event as shown below. Note that `formatted_body`
is now absent, because it was absent in the replacement event, but
`m.relates_to` remains unchanged (ie, absent).

```json
{
"event_id": "$original_event",
"type": "m.room.message",
"content": {
"body": "I *really* like *chocolate* cake",
"msgtype": "m.text",
"com.example.extension_property": "chocolate"
}
}
```

### Permalinks

Permalinks to edited events should capture the event ID that the sender is
viewing at that point (which might be an edit ID). The client viewing the
permalink should resolve this ID to the source event ID, and then display the
most recent version of that event.

### Redactions

When a message using a `rel_type` of `m.replace` is redacted, it removes that
edit revision. This has little effect if there were subsequent edits, however
if it was the most recent edit, the event is in effect reverted to its content
before the redacted edit.

Redacting the original message in effect removes the message, including all
subsequent edits, from the visible timeline. In this situation, homeservers
will return an empty `content` for the original event as with any other
redacted event. It must be noted that, although they are not immediately
visible in Element, subsequent edits remain unredacted and can be seen via API
calls. See [Future considerations](#future-considerations).

### Server-side aggregation of `m.replace` relationships

Note that there can be multiple event with an `m.replace` relationship to a
given event (for example, if an event is edited multiple times). Homeservers
should aggregate `m.replace` relationships as in
[MSC2675](https://github.com/matrix-org/matrix-doc/pull/2675). The aggregation
turt2live marked this conversation as resolved.
Show resolved Hide resolved
gives the `event_id`, `origin_server_ts`, and `sender` of the most recent
replacement event (as determined by `origin_server_ts`).

This aggregation is bundled into the `unsigned/m.relations` property of any
event that is the target of an `m.replace` relationship. For example:

```json5

{
"event_id": "$original_event_id",
// ...
"unsigned": {
"m.relations": {
"m.replace": {
"event_id": "$latest_edit_event_id",
"origin_server_ts": 1649772304313,
"sender": "@editing_user:localhost
}
}
}
}
```

If the original event is redacted, any `m.replace` relationship should **not**
be bundled with it (whether or not any subsequent edits are themselves
redacted). Note that this behaviour is specific to the `m.replace`
relationship.

## Edge Cases

How do you handle racing edits?
* The edits could form a DAG of relations for robustness.
* Tie-break between forward DAG extremities based on origin_ts
* this should be different from the target event_id in the relations, to
make it easier to know what is being replaced.
* hard to see who is responsible for linearising the DAG when receiving.
Nasty for the client to do it, but the server would have to buffer,
meaning relations could get stuck if an event in the DAG is unavailable.
* ...or do we just always order by on origin_ts, and rely on a social problem
for it not to be abused?
* problem is that other relation types might well need a more robust way of
ordering. XXX: can we think of any?
* could add the DAG in later if it's really needed?
* the abuse vector is for a malicious moderator to edit a message with origin_ts
of MAX_INT. the mitigation is to redact such malicious messages, although this
does mean the original message ends up being vandalised... :/
* Conclusion: let's do it for origin_ts as a first cut, but use event shapes which
could be switched to DAG in future is/as needed. Good news is that it only
affects the server implementation; the clients can trust the server to linearise
correctly.

What can we edit?
* Only non-state events for now.
richvdh marked this conversation as resolved.
Show resolved Hide resolved
* We can't change event types, or anything else which is in an E2E payload
* We can't change relation types either.
richvdh marked this conversation as resolved.
Show resolved Hide resolved

How do diffs work on edits if you are missing intermediary edits?
* We just have to ensure that the UI for visualising diffs makes it clear
that diffs could span multiple edits rather than strictly be per-edit-event.

What happens when we edit a reply?
* We just send an m.replace which refers to the m.reference target; nothing
richvdh marked this conversation as resolved.
Show resolved Hide resolved
special is needed. i.e. you cannot change who the event is replying to.
richvdh marked this conversation as resolved.
Show resolved Hide resolved
* The edited reply should ditch the fallback representation of the reply itself
richvdh marked this conversation as resolved.
Show resolved Hide resolved
however from `m.new_content` (specifically the `<mx-reply>` tag in the
HTML, and the chevron prefixed text in the plaintext which we don't know
whether to parse as we don't know whether this is a reply or not), as we
can assume that any client which can handle edits can also display replies
natively.

What power level do you need to be able to edit other people's messages, and how
richvdh marked this conversation as resolved.
Show resolved Hide resolved
richvdh marked this conversation as resolved.
Show resolved Hide resolved
does it fit in with federation event auth rules?
* 50, by default?

XXX: Synapse doesn't impose this currently - it lets anyone send an edit,
but then filters them out of bundled data.

"Editing other people's messages is evil; we shouldn't allow it"
* Sorry, we have to bridge with systems which support cross-user edits.
* When it happens, we should make it super clear in the timeline that a message
was edited by a specific user.
* We do not recommend that native Matrix clients expose this as a feature.
Copy link
Contributor

@MadLittleMods MadLittleMods Dec 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not validating edits on the homeserver causes edits to be a super easy footgun for anyone writing a client or application-service. Someone can exploit this and edit anyone's message if you implement edits naively and expect the homeserver to validate edits like you can for redactions. We have first-hand experience of getting this wrong this in all of the Element clients 🥴

When exploitable, for native Matrix clients, it's most likely just a display problem since fixing the exploit on the client, results in the errant edit event being ignored and the original event shows up again or maybe even not a problem at all because Synapse filters them out of bundled data (not sure on exact Synapse details). But for bridges (application services) it can be a much bigger problem! Those edit events are not filtered out and get sent straight to the application service via /transactions. When someone naively relies on edit events to be correctly permissible against the room, those events get accidentally processed by the bridge and the information is persisted and overwritten in the 3rd party service.

Validating edits with a power-level like redactions is a pretty good way forward though ⏩

matrix-org/synapse#5364 tracks the same problem space on Synapse

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've attempted to call out the dangers of trusting the edit events. For now, that's the framework we need to work within. If that's unworkable, it's going to need another round of changes to implementations and spec, so it's out of scope here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a note where "Client authors are reminded to take note of the requirements for Validity of message edit events, and to ignore any invalid edit events that may be received." But this doesn't really call out any danger. It's such a subtle note pointing you to a list of ways you can get it wrong.

It seems like we should at least plainly point out these examples out in the security considerations section to explain what can happen.

And maybe a note about a future consideration to have it under a power-level.



what happens if `m.new_content` is absent? or the `type` is different?

## Future considerations

### Ordering of edits

In future we may wish to consider ordering replacements (or relations in
general) via a DAG rather than using `origin_server_ts` to determine ordering -
particularly to mitigate potential abuse of edits applied by moderators.
Whatever, care must be taken by the server to ensure that if there are multiple
replacement events, the server must consistently choose the same one as all
other servers.

### Redaction of edits

It is highly unintuitive that redacting the original event leaves subsequent
edits visible to curious eyes even though they are hidden from the
timeline. This is considered a bug which this MSC makes no attempt to
resolve. See also
Comment on lines +388 to +389
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the MSC could at least suggest that clients redacting events (as moderators or as senders) redact the edits too, if it knows about them...

From a moderation perspective it's certainly a good idea to redact the edits too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the MSC could at least suggest that clients redacting events redact the edits too

Is that what we actually do though? I'm reluctant to spec it if it's not a thing that happens.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not what we do, but the rationale for why we should be redacting old versions is much easier to reason about than why we didn't implement the feature in the first place.

it's long-since been considered a security-ish issue aiui that clients (including bots) don't redact old versions properly. What would be future MSC territory though is making it happen magically from the server rather than requiring the client to do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, it just feels like a can of worms that I don't particularly want to open right now. What happens if the client doesn't have all the edits? (ok, it's better, but it's still a crappy solution). What happens if there are a million edits and we unexpectedly hit a rate limit?

I'm not really sure what we gain by adding a "hey, we could do this totally untested temporary hack" to an MSC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what we actually do though? I'm reluctant to spec it if it's not a thing that happens.

Nheko does, but I don't think other clients do.

Link: https://github.com/Nheko-Reborn/nheko/blob/b6bbbdeae7966761ad2de7cdad92363c6926cfb5/src/timeline/TimelineModel.cpp#L1275

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly sure that if we were to pull stats then the average number of edits would be less than a million, well within a rate limit :p

but we're talking about abuse situations, not the "average", aren't we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's several degrees of abuse which we can consider. Spamming edits is one form of abuse, but so is redacting a message and not the edits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well fine, this isn't a hill I'm going to die on. Can you suggest some words?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Telegram bridge also redacts all edits when a message is deleted on Telegram. It causes a fun redacted event placeholder spam on Matrix because clients don't know they're edits anymore (#3389), especially if you delete a message sent by some fun telegram bot that edits a message every second for a few hours. But appservices aren't ratelimited so it's fine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(A follow up MSC could also remove ratelimit for edit and reaction redactions)

[element-web#11978](https://github.com/vector-im/element-web/issues/11978) and
[synapse#5594](https://github.com/matrix-org/synapse/issues/5594).