crypto: misc refactorings #3000

Merged
6 commits merged on Jan 11, 2024

Member

@bnjbvr bnjbvr commented Jan 9, 2024

This started as a partial attempt to fix #1415, but the commit that was supposed to fix it has been removed from this PR, as it wasn't deemed useful: only refactorings remain, each in its own commit.

For the record, the previous description of this PR follows:


We're using application-level transactions to make sure that the account is properly synchronized in the cache vs in the database.

Before this commit, the transaction would be committed only when all the operations in it succeeded. This was based on the assumption that most encryption requests could be replayed by re-sending them to the server. Unfortunately, this assumption doesn't hold when generating one-time keys: the client could generate one-time keys, then the application-level transaction could fail, resulting in the client "forgetting" about the one-time keys it uploaded. The server rejects reuploads of existing one-time keys, so that would end up wedging a device, causing unable-to-decrypt events, without a proper way out.

Here, we propose to save the account just after one-time keys have been generated, in a separate transaction.
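To make the proposal concrete, here is a minimal synchronous sketch of the difference between one big transaction and a separate transaction for key generation. Everything in it (`Account`, `Store`, `with_transaction`, the `"otk-1"` key) is a hypothetical stand-in chosen for illustration, not the SDK's actual types or the code from this PR.

```rust
/// Hypothetical stand-in for the crypto account (not the SDK's real type).
#[derive(Clone, Debug, Default, PartialEq)]
struct Account {
    one_time_keys: Vec<String>,
}

/// Hypothetical store that only remembers the last committed account.
#[derive(Default)]
struct Store {
    persisted: Account,
}

impl Store {
    /// Runs `f` on a working copy and commits the copy only if `f` succeeds.
    fn with_transaction<T>(
        &mut self,
        f: impl FnOnce(&mut Account) -> Result<T, String>,
    ) -> Result<T, String> {
        let mut working = self.persisted.clone();
        let out = f(&mut working)?;
        self.persisted = working; // commit
        Ok(out)
    }
}

/// Single transaction: a later failure discards the freshly generated key.
fn generate_then_fail_single(store: &mut Store) -> Result<(), String> {
    store.with_transaction(|account| {
        account.one_time_keys.push("otk-1".to_owned());
        Err("later step failed".to_owned())
    })
}

/// Separate transactions: the key is committed before the fallible work runs.
fn generate_then_fail_separate(store: &mut Store) -> Result<(), String> {
    store.with_transaction(|account| {
        account.one_time_keys.push("otk-1".to_owned());
        Ok(())
    })?;
    store.with_transaction(|_account| Err("later step failed".to_owned()))
}
```

Both functions return an error, but only the second leaves `"otk-1"` in `store.persisted`. Note that this sketch assumes the commit itself cannot fail.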

A partial attempt to address #1415.

cc @kegsay @richvdh @BillCarsonFr

@bnjbvr bnjbvr requested a review from a team as a code owner January 9, 2024 12:35
@bnjbvr bnjbvr requested review from Hywan and removed request for a team January 9, 2024 12:35
@bnjbvr bnjbvr force-pushed the commit-transaction-after-generating-otk branch from 111d6da to cdd9524 Compare January 9, 2024 12:40

codecov bot commented Jan 9, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (97b2a20) 83.50% compared to head (329157a) 83.47%.

| Files | Patch % | Lines |
|---|---|---|
| crates/matrix-sdk-crypto/src/machine.rs | 66.66% | 1 Missing ⚠️ |
| crates/matrix-sdk/src/sliding_sync/mod.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3000      +/-   ##
==========================================
- Coverage   83.50%   83.47%   -0.03%     
==========================================
  Files         222      222              
  Lines       23214    23213       -1     
==========================================
- Hits        19384    19378       -6     
- Misses       3830     3835       +5     

@bnjbvr bnjbvr force-pushed the commit-transaction-after-generating-otk branch 2 times, most recently from f9cfd74 to c1d8d4f Compare January 9, 2024 16:39
Member

@Hywan Hywan left a comment

Technical and well-done. Thanks for the PR!

@richvdh richvdh requested review from Hywan and richvdh and removed request for Hywan January 10, 2024 11:19
Member

richvdh commented Jan 10, 2024

> The server rejects reuploads of existing one-time keys

No. The server rejects uploads of a different one-time-key with the same ID as an existing one.

> , so that would end up wedging a device, causing unable-to-decrypt events

The fact that the server is catching the error is actually saving you from UTDs. The real problem with uploading OTKs before they are persisted to the db is this scenario:

  • Alice's client generates an OTK but does not yet persist it to the db.
  • Alice's client uploads the OTK to the server.
  • Alice's client is killed (OOM? device is turned off?) before the key is persisted, losing all record of that OTK
  • Bob attempts to send a message to Alice, so claims the OTK
  • Alice, having restarted her client, receives the message from Bob. But it is encrypted with an OTK she has no record of. UTD error follows.
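The scenario above, together with the upload rule described earlier in this comment, can be sketched as a toy model. The `Server` and `Client` types, the key ID `"AAAA"`, and the key values are all hypothetical. Accepting an identical re-upload is an assumption read from the comment, which says only that a different key under an existing ID is rejected.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum UploadError {
    KeyIdAlreadyExists,
}

/// Toy homeserver mapping key ID to public key (not a real server API).
#[derive(Default)]
struct Server {
    one_time_keys: HashMap<String, String>,
}

impl Server {
    /// Rejects a *different* key under an existing ID; an identical
    /// re-upload is accepted (assumption, see the lead-in).
    fn upload(&mut self, key_id: &str, public_key: &str) -> Result<(), UploadError> {
        if let Some(existing) = self.one_time_keys.get(key_id) {
            if existing.as_str() != public_key {
                return Err(UploadError::KeyIdAlreadyExists);
            }
        }
        self.one_time_keys.insert(key_id.to_owned(), public_key.to_owned());
        Ok(())
    }

    /// Another user claims a key, which removes it from the server.
    fn claim(&mut self, key_id: &str) -> Option<String> {
        self.one_time_keys.remove(key_id)
    }
}

/// Toy client: `persisted` survives a restart, `in_memory` does not.
#[derive(Default)]
struct Client {
    in_memory: HashMap<String, String>,
    persisted: HashMap<String, String>,
}

impl Client {
    fn generate(&mut self, key_id: &str, public_key: &str) {
        self.in_memory.insert(key_id.to_owned(), public_key.to_owned());
    }

    fn persist(&mut self) {
        self.persisted = self.in_memory.clone();
    }

    fn restart(&mut self) {
        self.in_memory = self.persisted.clone();
    }

    fn knows(&self, key_id: &str) -> bool {
        self.in_memory.contains_key(key_id)
    }
}

/// Replays the steps above; returns true if Alice ends up unable to decrypt.
fn lost_otk_scenario(persist_before_upload: bool) -> bool {
    let mut server = Server::default();
    let mut alice = Client::default();
    alice.generate("AAAA", "key-1");
    if persist_before_upload {
        alice.persist();
    }
    server.upload("AAAA", "key-1").expect("first upload succeeds");
    alice.restart(); // client killed and restarted
    let bob_claimed = server.claim("AAAA").is_some();
    bob_claimed && !alice.knows("AAAA")
}
```

In this model, `lost_otk_scenario(false)` ends in a UTD, while persisting before the upload (`lost_otk_scenario(true)`) does not.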

Member

@richvdh richvdh left a comment

sorry, maybe I'm being stupid

Comment on lines 1224 to 1234

    self.inner
        .store
        .with_transaction(|mut transaction| async {
            let account = transaction.account().await.map_err(OlmError::Store)?;
            account.update_key_counts(
                sync_changes.one_time_keys_counts,
                sync_changes.unused_fallback_keys,
            );
            Ok((transaction, ()))
        })
        .await?;
Member

I'm afraid I'm failing to follow how this fixes the problem.

Is there not still a possibility that committing the transaction can fail, leaving us in the same situation as before? (As I understand it, that situation is that the Account has been updated, but not persisted to the database.)
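The situation described here (account updated in memory, commit failing afterwards) can be illustrated with a small sketch. The `Store`, its `cache` and `database` fields, and the `fail_next_commit` switch are hypothetical; the thread does not show how the SDK's transaction actually orders these steps.

```rust
/// Hypothetical stand-in for the account state.
#[derive(Clone, Debug, Default, PartialEq)]
struct Account {
    uploaded_key_count: u64,
}

/// Hypothetical store with an in-memory cache and a fallible "database".
#[derive(Default)]
struct Store {
    cache: Account,
    database: Account,
    fail_next_commit: bool,
}

impl Store {
    /// Writes `account` to the database, or fails if told to.
    fn commit(&mut self, account: &Account) -> Result<(), String> {
        if self.fail_next_commit {
            return Err("commit failed".to_owned());
        }
        self.database = account.clone();
        Ok(())
    }

    /// Mutates the cache first: a failed commit leaves cache and
    /// database out of sync.
    fn update_in_place(&mut self, count: u64) -> Result<(), String> {
        self.cache.uploaded_key_count = count;
        let snapshot = self.cache.clone();
        self.commit(&snapshot)
    }

    /// Mutates a copy and replaces the cache only after the commit
    /// succeeded, so a failure leaves both sides unchanged.
    fn update_copy_then_swap(&mut self, count: u64) -> Result<(), String> {
        let mut working = self.cache.clone();
        working.uploaded_key_count = count;
        self.commit(&working)?;
        self.cache = working;
        Ok(())
    }
}
```

With `fail_next_commit` set, `update_in_place` returns an error and leaves `cache != database`, whereas `update_copy_then_swap` returns the same error with both still equal.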

Member

Hmm that's correct that this transaction can fail, if the account cannot be fetched. The rest of the code cannot fail.

Member

One solution might be to fetch the account before running the transaction (if that's possible: for now, the account is fetched through the transaction).

Member

> Hmm that's correct that this transaction can fail, if the account cannot be fetched. The rest of the code cannot fail

You mean to say that it's impossible that committing the transaction, after the account is updated, can fail? With the greatest of respect, I don't believe you 😛 .

Member Author

We haven't quite identified the cause of the problem yet, so we're still navigating a bit randomly here, and two days ago we thought this could be the source, but it's probably not.

But the only way to modify an Account is to get a transaction, which gives us a write-access scope to the account. In theory, there's no other way to upsert Accounts; in practice, that needs to be double-checked, because I'm not sure the API to write an Account to the database is private/hidden/limited.
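As a rough illustration of a "write-access scope", the sketch below uses module privacy so that a mutable `Account` stored in the `Store` is reachable only through a transaction. The module, types, and `set_key_count` helper are hypothetical and synchronous; only the `Ok((transaction, ()))` closure shape mirrors the snippet under review.

```rust
mod store {
    #[derive(Clone, Debug, Default)]
    pub struct Account {
        key_count: u64,
    }

    impl Account {
        pub fn key_count(&self) -> u64 {
            self.key_count
        }

        pub fn update_key_count(&mut self, count: u64) {
            self.key_count = count;
        }
    }

    /// Cannot be constructed outside this module: its field is private.
    pub struct Transaction {
        account: Account,
    }

    impl Transaction {
        /// The only place that hands out `&mut Account` for stored data.
        pub fn account(&mut self) -> &mut Account {
            &mut self.account
        }
    }

    #[derive(Default)]
    pub struct Store {
        account: Account,
    }

    impl Store {
        /// Read-only access needs no transaction.
        pub fn account(&self) -> &Account {
            &self.account
        }

        /// Commits the transaction's account only if the closure returns `Ok`.
        pub fn with_transaction<T, E>(
            &mut self,
            f: impl FnOnce(Transaction) -> Result<(Transaction, T), E>,
        ) -> Result<T, E> {
            let txn = Transaction { account: self.account.clone() };
            let (txn, out) = f(txn)?;
            self.account = txn.account;
            Ok(out)
        }
    }
}

use store::Store;

/// Example caller: all writes go through the transaction closure.
fn set_key_count(store: &mut Store, count: u64) -> Result<(), String> {
    store.with_transaction(|mut txn| {
        txn.account().update_key_count(count);
        Ok((txn, ()))
    })
}
```

Code outside `mod store` can read via `Store::account` but has no path to overwrite the stored account except `with_transaction`. That is the property the comment says still needs to be double-checked in the real API.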

Member

richvdh commented Jan 10, 2024

[Could this also fix #2998? If not, it's not a complete fix to #1415. Have updated the description on the assumption it is not.]

@bnjbvr bnjbvr force-pushed the commit-transaction-after-generating-otk branch from c1d8d4f to dd48b53 Compare January 11, 2024 12:27
@bnjbvr bnjbvr changed the title from "crypto: save the account immediately after generating one-time keys" to "crypto: misc refactorings" Jan 11, 2024
Member Author

bnjbvr commented Jan 11, 2024

For what it's worth, I'm retargeting this PR to only include the drive-by refactorings I've added, and I've removed the commit that was attempting a fix, because it's pretty certain that it's not a fix.

As long as we don't have a clear repro for this issue, it seems inefficient to try to fix it.

@bnjbvr bnjbvr force-pushed the commit-transaction-after-generating-otk branch from dd48b53 to 329157a Compare January 11, 2024 14:24
@bnjbvr bnjbvr enabled auto-merge (rebase) January 11, 2024 14:25
@bnjbvr bnjbvr disabled auto-merge January 11, 2024 15:19
@bnjbvr bnjbvr merged commit 3fea9ae into main Jan 11, 2024
34 checks passed
@bnjbvr bnjbvr deleted the commit-transaction-after-generating-otk branch January 11, 2024 15:19
Development

Successfully merging this pull request may close these issues.

Lost OTK, leading to "OneTime key already exists" error and later UTDs
3 participants