Avoid dummy-patching for out-of-scope handlers (via active/passive states) #731
In #686, a complicated bug is well-described and well-debugged by @paxbit: under certain conditions, Kopf floods the K8s API with dummy patches with no pause between them. Specifically, this happens when there is a persisted state of a handler that has started and has not finished (with either success or failure), BUT has fallen out of scope during the processing cycle (e.g. due to filters).
The bug has been there since the "purposes" were introduced in #606, but it was activated and made visible by the fn+id handler deduplication in #674.
Reproducible with:
The fix (this PR) replaces the state's "done"/"delays" computation logic:
If a handler falls out of scope during the processing cycle, it will not be taken into account in the next "done"/"delays" computation, even if it was not finished before — this was the behaviour before the "purposes" were introduced in #606.
TODOs:
- `0` vs. `None` delays in handler states (see the discussion in "Endless touch-dummy Annotation Patching On Pod Causes High API Load", #686).

Replaces #728. Fixes #686.
Difference from #728: There, `.done`/`.delays` were changed to `check_done(handlers)`/`get_delays(handlers)`, requiring the handlers list on every check. While this is not a problem for the logic itself, it feels wrong, as the state is supposed to be stateful by definition, and we feed it the same handlers a few lines before that. In this PR, the global state remembers which handlers are "active" or "passive"; only the "active" handlers are used in `.done`/`.delays`, while all of them, "active" and "passive", are used in `.counts`/`.extras`/`.purge()` with the previous logic of "[re]purposing".
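The active/passive idea can be illustrated with a minimal sketch. This is not Kopf's actual implementation: the classes `HandlerState` and `State`, the method `with_handlers()`, and the property `total` are hypothetical names made up for illustration, and the real state carries far more data. The sketch only shows how an unfinished handler that fell out of scope stops affecting `done`/`delays` while still being remembered.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class HandlerState:
    """Hypothetical per-handler record (not Kopf's real class)."""
    finished: bool = False
    delay: Optional[float] = None  # seconds until the next retry, if known
    active: bool = False           # True only if in scope in this cycle


@dataclass
class State:
    """Hypothetical simplified state of all handlers of one object."""
    handlers: Dict[str, HandlerState] = field(default_factory=dict)

    def with_handlers(self, handler_ids: Iterable[str]) -> "State":
        # Handlers passed here become "active"; persisted handlers that
        # are not passed are kept, but marked as "passive".
        in_scope = set(handler_ids)
        merged = {
            hid: HandlerState(hs.finished, hs.delay, active=hid in in_scope)
            for hid, hs in self.handlers.items()
        }
        for hid in in_scope:
            merged.setdefault(hid, HandlerState(active=True))
        return State(merged)

    @property
    def done(self) -> bool:
        # Only the active handlers decide whether processing is done.
        return all(hs.finished for hs in self.handlers.values() if hs.active)

    @property
    def delays(self) -> List[float]:
        # Only the active, unfinished handlers contribute their delays.
        return [
            hs.delay for hs in self.handlers.values()
            if hs.active and not hs.finished and hs.delay is not None
        ]

    @property
    def total(self) -> int:
        # Bookkeeping (a stand-in for counts/extras/purge) sees all
        # handlers, both active and passive.
        return len(self.handlers)


# Handler "a" started but never finished, then fell out of scope.
persisted = State({
    "a": HandlerState(finished=False, delay=0.0),
    "b": HandlerState(finished=True),
})
state = persisted.with_handlers(["b"])
print(state.done, state.delays, state.total)
```

Because the state remembers which handlers are active, the properties need no arguments, unlike the `check_done(handlers)`/`get_delays(handlers)` approach of #728.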