Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Backup Topic Creation Retries and Interrupts #540

Merged
merged 1 commit into from
Feb 15, 2023

Conversation

Fleshgrinder
Copy link
Contributor

@Fleshgrinder Fleshgrinder commented Feb 13, 2023

The retry logic in the backup code for admin client and topic creation is flawed as it compares seconds to milliseconds. Instead of aborting after the desired 60 seconds it aborts after 60,000 seconds. The retry implementation itself has several other issues, most notable the fact that it swallows BaseException, and thus cannot be interrupted by the user when it enters the retry loop.

Instead of trying to come up with an optimized retry logic or a generic solution that can be used across the board, I decided to add a well tested library that already provides this logic. I also use timedelta in favor of primitives for all durations, to avoid unit confusion. This was reverted, because we have to use an older tenacity version that does not support timedelta.

This updates only the backup code. Other code might or might not have similar flaws. For instance, after this change there are still 11 pylint: disable=bare-except left in the karapace source directory.


Requires #537 to be merged first, as it includes it and builds on top of it.

@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from 7e34e33 to 801e743 Compare February 13, 2023 16:09
@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Feb 13, 2023

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: 676b3d8
Status: ✅  Deploy successful!
Preview URL: https://ebf80510.karapace.pages.dev
Branch Preview URL: https://fleshgrinder-backup-topic-cr.karapace.pages.dev

View logs

@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from 801e743 to f1342f0 Compare February 14, 2023 08:41
@Fleshgrinder Fleshgrinder self-assigned this Feb 14, 2023
@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from f1342f0 to 6b11d77 Compare February 14, 2023 11:53
@Fleshgrinder Fleshgrinder marked this pull request as ready for review February 14, 2023 11:53
@Fleshgrinder Fleshgrinder requested review from a team as code owners February 14, 2023 11:53
@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from 6b11d77 to a50bb45 Compare February 14, 2023 12:42
@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from a50bb45 to 7b6d323 Compare February 14, 2023 13:49
@aiven-anton aiven-anton removed their request for review February 15, 2023 10:11
@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from 7b6d323 to 34b90a0 Compare February 15, 2023 10:41
The retry logic in the backup code for admin client and topic creation is flawed
as it compares seconds to milliseconds. Instead of aborting after the desired
60 seconds it aborts after 60,000 seconds. The retry implementation itself has
several other issues, most notable the fact that it swallows `BaseException`,
and thus cannot be interrupted by the user when it enters the retry loop.

Instead of trying to come up with an optimized retry logic or a generic solution
that can be used across the board, I decided to add a well tested library that
already provides this logic.

This updates only the backup code. Other code might or might not have similar
flaws. For instance, after this change there are still 11
`pylint: disable=bare-except` left in the `karapace` source directory.
@Fleshgrinder Fleshgrinder force-pushed the fleshgrinder/backup-topic-creation-retries branch from 34b90a0 to 676b3d8 Compare February 15, 2023 11:11
@aiven-anton aiven-anton removed their request for review February 15, 2023 12:22
@jjaakola-aiven jjaakola-aiven merged commit 4fd802d into main Feb 15, 2023
@jjaakola-aiven jjaakola-aiven deleted the fleshgrinder/backup-topic-creation-retries branch February 15, 2023 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants