Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor backup restore to handle serialization errors and conflicts #316

Merged
merged 13 commits into from
Jan 2, 2024

Conversation

vroldanbet
Copy link
Contributor

@vroldanbet vroldanbet commented Dec 18, 2023

Related to authzed/spicedb#1544
Related to authzed/spicedb#1542

This commit refactors the logic on the backup command into a restorer type. The fundamental differences with the previous logic are:

  • batch conflicts are detected, and provides to handling strategies: fail, skip or touch when using touch, they will also be retried if needed
  • serialization errors are detected, and optionally can be retried using WithRelationships with TOUCH semantics
  • retry strategy uses a backoff

any error on bulk import will cause the stream to close, so attempts to recover from that by retrying means retrying with another bulk import stream or using WriteRelationships. The latter was used because it normalizes retries as BulkImport does not support TOUCH semantics.

use

New flags are added to the zed backup restore command:

  • --conflict-strategy defaults to fail, which means zed will fail if a conflicting relationship is found. This is kept around to remain backward compatible with the original behavior.
    • skip would skip importing any batch that contains a conflicting relationship. Please note this may skip relationships that are part of the batch but not stored in SpiceDB
    • touch will issue a WriteRelationships call with each batch
      --disable-retries is added to perform retries, which is the original behavior. Once this PR lands, retries will be enabled by default. The maximum number of retries is 10 and has an exponential backoff. Both are non-configurable.
      --request-timeout is added to add a timeout to WriteRelationship calls used on conflicts. BulkImport API calls are not subject to the timeout.

@vroldanbet vroldanbet force-pushed the backup-import-with-retries-and-conflict-handling branch 8 times, most recently from 2e08955 to 068d914 Compare December 19, 2023 13:02
@vroldanbet vroldanbet self-assigned this Dec 19, 2023
@vroldanbet vroldanbet marked this pull request as ready for review December 19, 2023 13:05
@vroldanbet vroldanbet force-pushed the backup-import-with-retries-and-conflict-handling branch from 4e6adf5 to 659f208 Compare December 19, 2023 14:18
internal/cmd/backup.go Outdated Show resolved Hide resolved
internal/cmd/backup.go Outdated Show resolved Hide resolved
internal/cmd/restorer.go Outdated Show resolved Hide resolved
internal/cmd/restorer.go Outdated Show resolved Hide resolved
internal/cmd/restorer.go Outdated Show resolved Hide resolved
internal/grpcutil/grpcutil.go Show resolved Hide resolved
@josephschorr
Copy link
Member

josephschorr commented Dec 19, 2023

Reference: authzed/spicedb#1544
Reference: authzed/spicedb#1542

@vroldanbet vroldanbet force-pushed the backup-import-with-retries-and-conflict-handling branch 2 times, most recently from 8ff0438 to 81c5100 Compare December 20, 2023 09:06
@vroldanbet vroldanbet changed the title refactor backup import to handle serialization errors and conflicts refactor backup restore to handle serialization errors and conflicts Dec 20, 2023
internal/cmd/backup.go Outdated Show resolved Hide resolved
internal/cmd/backup.go Outdated Show resolved Hide resolved
internal/cmd/cmd.go Show resolved Hide resolved
@vroldanbet vroldanbet force-pushed the backup-import-with-retries-and-conflict-handling branch from 4f20903 to 1a9fec5 Compare January 2, 2024 10:16
Copy link
Member

@josephschorr josephschorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This commit refactors the logic on the backup command into
a restorer type. The fundamental differences with the previous logic are:
- batch conflicts are detected, and provides to handling strategies: fail, skip or touch
  when using touch, they will also be retried if needed
- serialization errors are detected, and optionally can be retried using
  WithRelationships with TOUCH semantics
- retry strategy uses a backoff

any error on bulk import will cause the stream to close, so attempts to recover
from that by retrying means retrying with another bulk import stream or
using WriteRelationships. The latter was used because it normalizes retries
as BulkImport does not support TOUCH semantics.
to include MySQL and Postgres
since MySQL sometimes can get locked for a long time
on serialization errors
so the restorer becomes fully self-contained
and facilitate exporting it
the flags were not registered in the deprecated
`zed restore` command when it was added
for back-compat after the command became
`zed backup restore`
turned into a string that can be parsed into
an enum
@josephschorr josephschorr force-pushed the backup-import-with-retries-and-conflict-handling branch from 1a9fec5 to 461353e Compare January 2, 2024 19:26
@vroldanbet vroldanbet added this pull request to the merge queue Jan 2, 2024
Merged via the queue into main with commit 4fb2ba0 Jan 2, 2024
15 checks passed
@vroldanbet vroldanbet deleted the backup-import-with-retries-and-conflict-handling branch January 2, 2024 19:31
@github-actions github-actions bot locked and limited conversation to collaborators Jan 2, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants