This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix database performance of read/write worker locks #16061

Merged
merged 2 commits into from
Aug 17, 2023

Conversation

erikjohnston
Member

We were seeing serialization errors when taking out multiple read locks.

The transactions were retried, so this wasn't causing any failures.

Introduced in #15782.

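The retry behaviour described above can be sketched in miniature. This is a hypothetical stand-in, not Synapse's actual transaction runner: `SerializationError` stands in for the failure Postgres raises under `REPEATABLE READ` when concurrent transactions conflict, and `run_with_retries` loosely mirrors the "transactions were retried" behaviour, so the lock is still acquired, just with extra round-trips.

```python
class SerializationError(Exception):
    """Stand-in for a Postgres serialization failure under REPEATABLE READ."""

def run_with_retries(txn_fn, max_retries=5):
    """Run a transaction function, retrying on serialization errors."""
    for attempt in range(max_retries):
        try:
            return txn_fn()
        except SerializationError:
            # Last attempt: give up and propagate the error.
            if attempt == max_retries - 1:
                raise

attempts = []
def acquire_read_lock():
    attempts.append(1)
    # Simulate two conflicting attempts before the lock is acquired.
    if len(attempts) < 3:
        raise SerializationError()
    return "acquired"

result = run_with_retries(acquire_read_lock)
print(result, len(attempts))  # acquired 3
```

The visible symptom was exactly this: no failures, but wasted work on every conflict.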
@erikjohnston erikjohnston marked this pull request as ready for review August 3, 2023 17:14
@erikjohnston erikjohnston requested a review from a team as a code owner August 3, 2023 17:14
Comment on lines +98 to +100
self._clock.looping_call(
self._reap_stale_read_write_locks, _LOCK_TIMEOUT_MS / 10.0
)
Contributor

Is every worker going to be doing this reaping? I guess that's fine since the reaps are idempotent.

Member Author

Yeah, good point but I don't think it matters in practice.
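Why running the reap on every worker is fine can be seen with a toy model. This is an in-memory sketch, not the real `worker_read_write_locks` table: the key point is that the reap is idempotent, so a second run (on this worker or any other) simply finds nothing left to delete.

```python
import time

_LOCK_TIMEOUT_MS = 30_000  # illustrative value, not the real constant

now_ms = time.time() * 1000
# In-memory stand-in for the locks table: (lock_name, lock_key) -> last renewal.
locks = {
    ("name", "stale_key"): now_ms - 2 * _LOCK_TIMEOUT_MS,
    ("name", "fresh_key"): now_ms,
}

def reap_stale_read_write_locks(now_ms):
    """Drop locks that have not been renewed within the timeout.

    Idempotent: running it again deletes nothing further, so many
    workers running it on a timer do not interfere with each other.
    """
    stale = [k for k, ts in locks.items() if now_ms - ts > _LOCK_TIMEOUT_MS]
    for key in stale:
        locks.pop(key, None)
    return len(stale)

first = reap_stale_read_write_locks(now_ms)
second = reap_stale_read_write_locks(now_ms)
print(first, second)  # 1 0
```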

@@ -216,6 +219,7 @@ async def try_acquire_read_write_lock(
lock_name,
lock_key,
write,
db_autocommit=True,
Contributor

Worth a comment? Why is it unhelpful to run this function in a transaction?

I think:

  • we're still using isolation level REPEATABLE READ
  • but we're not in a transaction, so if it fails we won't bother to retry
  • which means we'll properly wait for the lock renewal period before trying to lock again, thrashing the DB less?

Member Author

Broadly: there's not much point wrapping a single query in a transaction, as it has essentially the same semantics either way, and by using autocommit we reduce the work since we don't have to start and end a transaction.

Member Author

Oh: and it means the length of time we hold the lock on those rows is reduced to just the query execution time, which helps reduce the likelihood of clashes.
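The autocommit behaviour can be demonstrated with `sqlite3` as a stand-in for the Postgres connection (the PR itself concerns Postgres; the table shape here is illustrative). With `isolation_level=None`, Python's `sqlite3` runs in autocommit mode: each statement commits on its own with no surrounding `BEGIN`/`COMMIT`, so any row locks are held only for the duration of that single statement.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode: each statement
# commits on its own, with no implicit transaction wrapped around it.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE worker_read_write_locks"
    " (lock_name TEXT, lock_key TEXT, write_lock INTEGER)"
)

conn.execute(
    "INSERT INTO worker_read_write_locks VALUES (?, ?, ?)",
    ("name", "key", 0),
)
# No conn.commit() needed: the row is already durable and visible.
count = conn.execute(
    "SELECT COUNT(*) FROM worker_read_write_locks"
).fetchone()[0]
print(count)  # 1
```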

"""

if isinstance(self.database_engine, PostgresEngine):
# For Postgres we can send these queries at the same time.
Contributor

Is the point just that we now always send these as two separate queries (deletes in a background process)?

Member Author

Yup, which means they're much less likely to conflict and cause retries
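The shape of this change can be sketched abstractly. This is a hypothetical model, not the PR's actual SQL: the statement texts and `FakeDB` are illustrative, and the point is only the difference in footprint — the old path bundles a wide-ranging delete with the insert in one transaction, while the new path sends a single autocommitted insert and leaves deletion to the background reaper.

```python
class FakeDB:
    """Records the statements sent, to compare conflict footprints."""
    def __init__(self):
        self.statements = []
    def execute(self, sql):
        self.statements.append(sql)

def try_acquire_old(db):
    # Before: delete stale rows and insert the new lock in one transaction,
    # so concurrent acquisitions scanned overlapping rows and often conflicted.
    db.execute("BEGIN")
    db.execute("DELETE FROM worker_read_write_locks WHERE <stale>")
    db.execute("INSERT INTO worker_read_write_locks VALUES (...)")
    db.execute("COMMIT")

def try_acquire_new(db):
    # After: a single autocommitted insert; stale rows are removed
    # separately by the periodic background reaper.
    db.execute("INSERT INTO worker_read_write_locks VALUES (...)")

old, new = FakeDB(), FakeDB()
try_acquire_old(old)
try_acquire_new(new)
print(len(old.statements), len(new.statements))  # 4 1
```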

Contributor

@DMRobertson DMRobertson left a comment

SGTM (assuming I've understood correctly).

@erikjohnston erikjohnston merged commit eb0dbab into develop Aug 17, 2023
37 checks passed
@erikjohnston erikjohnston deleted the erikj/fix_serialisation_error branch August 17, 2023 13:07
2 participants