-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Fix background updates failing to add unique indexes on receipts #14453
Fix background updates failing to add unique indexes on receipts #14453
Conversation
As part of the database migration to support threaded receipts, there is a possible window in between `73/08thread_receipts_non_null.sql.postgres` removing the original unique constraints on `receipts_linearized` and `receipts_graph` and the `reeipts_linearized_unique_index` and `receipts_graph_unique_index` background updates from `72/08thread_receipts.sql` completing where the unique constraints on `receipts_linearized` and `receipts_graph` are missing. Any emulated upserts on these tables must therefore be performed with a lock held, otherwise duplicate rows can end up in the tables when there are concurrent emulated upserts. Note that emulated upserts no longer happen by default on sqlite, since the minimum supported version of sqlite supports native upserts by default now. Signed-off-by: Sean Quah <[email protected]>
# Then remove all duplicate receipts. | ||
# We could be clever and try to keep the latest receipt out of every set of | ||
# duplicates, but it's far simpler to remove them all. | ||
for room_id, receipt_type, user_id in duplicate_keys: | ||
sql = f""" | ||
DELETE FROM {table} | ||
WHERE | ||
room_id = ? AND | ||
receipt_type = ? AND | ||
user_id = ? AND | ||
thread_id IS NULL | ||
""" | ||
txn.execute(sql, (room_id, receipt_type, user_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth trying to preserve these read receipts?
For receipts_linearized
, we can choose to keep the receipt with the highest stream_id
.
For receipts_graph
, we have to dig into data
, which is a user-provided json field, and extract out the ts
field. This is a lot more painful, as we have to handle invalid data and ties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My hope is that they're completely identical but with thread_id
being NULL
(where NULL != NULL
)? I guess that's hard to check though.
Honestly the receipts_graph
isn't read from right now so I think it is probably safe to just delete them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although deleting from receipts_linearized
will have a user-visible impact, so I'm unsure if that's a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a bit of rejigging to preserve the receipt with the highest stream_id
in receipts_linearized
. Users shouldn't notice anything now.
7d847ac
to
dc8bdbd
Compare
Before creating the `receipts_graph_unique_index` and `receipts_linearized_unique_index` unique indexes, we have to clean up any duplicate receipts that may have crept in due to #14406. Signed-off-by: Sean Quah <[email protected]>
Signed-off-by: Sean Quah <[email protected]>
dc8bdbd
to
dfe9946
Compare
@@ -113,24 +113,6 @@ def __init__( | |||
prefilled_cache=receipts_stream_prefill, | |||
) | |||
|
|||
self.db_pool.updates.register_background_index_update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why move these to the non-worker store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved them to the *BackgroundUpdateStore
. I thought that's where we usually put the background updates?
Is there a motivation for having these on the worker store that I've completely missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no motivation besides it being consistent with other examples I saw. 🤷
(I think someone non-me should review this since I made the original issue causing this. Overall I think the fix looks reasonable though.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems sane to me
# receipts_linearized has a unique constraint on | ||
# (user_id, room_id, receipt_type), so no need to lock | ||
lock=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To double check: is the table correctly deemed safe to upsert into when the relevant background updates have run? (Wasn't sure how the second commit would affect this, if at all)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, once the unique index has been added by the background update, we will be able to rely on native upserts again (and the value of lock
won't matter).
…_constraint_background_update
This is failing trial on postgres with |
I've updated and refactored things a bit to
|
…_constraint_background_update
…_constraint_background_update
Synapse 1.72.0 (2022-11-22) =========================== Please note that Synapse now only supports PostgreSQL 11+, because PostgreSQL 10 has reached end-of-life, c.f. our [Deprecation Policy](https://github.com/matrix-org/synapse/blob/develop/docs/deprecation_policy.md). Bugfixes -------- - Update forgotten references to legacy metrics in the included Grafana dashboard. ([\matrix-org#14477](matrix-org#14477)) Synapse 1.72.0rc1 (2022-11-16) ============================== Features -------- - Add experimental support for [MSC3912](matrix-org/matrix-spec-proposals#3912): Relation-based redactions. ([\matrix-org#14260](matrix-org#14260)) - Build Debian packages for Ubuntu 22.10 (Kinetic Kudu). ([\matrix-org#14396](matrix-org#14396)) - Add an [Admin API](https://matrix-org.github.io/synapse/latest/usage/administration/admin_api/index.html) endpoint for user lookup based on third-party ID (3PID). Contributed by @ashfame. ([\matrix-org#14405](matrix-org#14405)) - Faster joins: include heroes' membership events in the partial join response, for rooms without a name or canonical alias. ([\matrix-org#14442](matrix-org#14442)) Bugfixes -------- - Faster joins: do not block creation of or queries for room aliases during the resync. ([\matrix-org#14292](matrix-org#14292)) - Fix a bug introduced in Synapse 1.64.0rc1 which could cause log spam when fetching events from other homeservers. ([\matrix-org#14347](matrix-org#14347)) - Fix a bug introduced in 1.66 which would not send certain pushrules to clients. Contributed by Nico. ([\matrix-org#14356](matrix-org#14356)) - Fix a bug introduced in v1.71.0rc1 where the power level event was incorrectly created during initial room creation. ([\matrix-org#14361](matrix-org#14361)) - Fix the refresh token endpoint to be under /r0 and /v3 instead of /v1. Contributed by Tulir @ Beeper. ([\matrix-org#14364](matrix-org#14364)) - Fix a long-standing bug where Synapse would raise an error when encountering an unrecognised field in a `/sync` filter, instead of ignoring it for forward compatibility. ([\matrix-org#14369](matrix-org#14369)) - Fix a background database update, introduced in Synapse 1.64.0, which could cause poor database performance. ([\matrix-org#14374](matrix-org#14374)) - Fix PostgreSQL sometimes using table scans for queries against the `event_search` table, taking a long time and a large amount of IO. ([\matrix-org#14409](matrix-org#14409)) - Fix rendering of some HTML templates (including emails). Introduced in v1.71.0. ([\matrix-org#14448](matrix-org#14448)) - Fix a bug introduced in Synapse 1.70.0 where the background updates to add non-thread unique indexes on receipts could fail when upgrading from 1.67.0 or earlier. ([\matrix-org#14453](matrix-org#14453)) Updates to the Docker image --------------------------- - Add all Stream Writer worker types to `configure_workers_and_start.py`. ([\matrix-org#14197](matrix-org#14197)) - Remove references to legacy worker types in the multi-worker Dockerfile. ([\matrix-org#14294](matrix-org#14294)) Improved Documentation ---------------------- - Upload documentation PRs to Netlify. ([\matrix-org#12947](matrix-org#12947), [\matrix-org#14370](matrix-org#14370)) - Add addtional TURN server configuration example based on [eturnal](https://github.com/processone/eturnal) and adjust general TURN server doc structure. ([\matrix-org#14293](matrix-org#14293)) - Add example on how to load balance /sync requests. Contributed by [aceArt](https://aceart.de). ([\matrix-org#14297](matrix-org#14297)) - Edit sample Nginx reverse proxy configuration to use HTTP/1.1. Contributed by Brad Jones. ([\matrix-org#14414](matrix-org#14414)) Deprecations and Removals ------------------------- - Remove support for PostgreSQL 10. ([\matrix-org#14392](matrix-org#14392), [\matrix-org#14397](matrix-org#14397)) Internal Changes ---------------- - Run unit tests against Python 3.11. ([\matrix-org#13812](matrix-org#13812)) - Add TLS support for generic worker endpoints. ([\matrix-org#14128](matrix-org#14128), [\matrix-org#14455](matrix-org#14455)) - Switch to a maintained action for installing Rust in CI. ([\matrix-org#14313](matrix-org#14313)) - Add override ability to `complement.sh` command line script to request certain types of workers. ([\matrix-org#14324](matrix-org#14324)) - Enabling testing of [MSC3874](matrix-org/matrix-spec-proposals#3874) (filtering of `/messages` by relation type) in complement. ([\matrix-org#14339](matrix-org#14339)) - Concisely log a failure to resolve state due to missing `prev_events`. ([\matrix-org#14346](matrix-org#14346)) - Use a maintained Github action to install Rust. ([\matrix-org#14351](matrix-org#14351)) - Cleanup old worker datastore classes. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#14375](matrix-org#14375)) - Test against PostgreSQL 15 in CI. ([\matrix-org#14394](matrix-org#14394)) - Remove unreachable code. ([\matrix-org#14410](matrix-org#14410)) - Clean-up event persistence code. ([\matrix-org#14411](matrix-org#14411)) - Update docstring to clarify that `get_partial_state_events_batch` does not just give you completely arbitrary partial-state events. ([\matrix-org#14417](matrix-org#14417)) - Fix mypy errors introduced by bumping the locked version of `attrs` and `gitpython`. ([\matrix-org#14433](matrix-org#14433)) - Make Dependabot only bump Rust deps in the lock file. ([\matrix-org#14434](matrix-org#14434)) - Fix an incorrect stub return type for `PushRuleEvaluator.run`. ([\matrix-org#14451](matrix-org#14451)) - Improve performance of `/context` in large rooms. ([\matrix-org#14461](matrix-org#14461))
) As part of the database migration to support threaded receipts, there is a possible window in between `73/08thread_receipts_non_null.sql.postgres` removing the original unique constraints on `receipts_linearized` and `receipts_graph` and the `reeipts_linearized_unique_index` and `receipts_graph_unique_index` background updates from `72/08thread_receipts.sql` completing where the unique constraints on `receipts_linearized` and `receipts_graph` are missing. Any emulated upserts on these tables must therefore be performed with a lock held, otherwise duplicate rows can end up in the tables when there are concurrent emulated upserts. Fix the missing lock. Note that emulated upserts no longer happen by default on sqlite, since the minimum supported version of sqlite supports native upserts by default now. Finally, clean up any duplicate receipts that may have crept in before trying to create the `receipts_graph_unique_index` and `receipts_linearized_unique_index` unique indexes. Signed-off-by: Sean Quah <[email protected]>
See #14915 for a follow-up to this. |
As part of the database migration to support threaded receipts, there is
a possible window in between
73/08thread_receipts_non_null.sql.postgres
removing the originalunique constraints on
receipts_linearized
andreceipts_graph
and thereeipts_linearized_unique_index
andreceipts_graph_unique_index
background updates from
72/08thread_receipts.sql
completing wherethe unique constraints on
receipts_linearized
andreceipts_graph
aremissing. Any emulated upserts on these tables must therefore be
performed with a lock held, otherwise duplicate rows can end up in the
tables when there are concurrent emulated upserts. Fix the missing lock.
Note that emulated upserts no longer happen by default on sqlite, since
the minimum supported version of sqlite supports native upserts by
default now.
Finally, clean up any duplicate receipts that may have crept in before
trying to create the
receipts_graph_unique_index
andreceipts_linearized_unique_index
unique indexes.Signed-off-by: Sean Quah [email protected]
Resolves #14406, but not the follow on to remove
lock=False
.The first commit fixes the race that creates the duplicates.
The second commit changes the background update to clean up the duplicates.