What problem are you trying to solve?
This is a follow-up to watcher paging/streaming, in an effort to reduce allocations and move the complexity to where it needs to be. The main problem here is allocation for initial lists (or initial streaming lists). These are essentially allocated twice internally:
- watcher's step_trampolined
- Store's apply_watcher_event
- thirdly? it looks like we clone into another buffer before allocating it, due to not consuming with an into_iter in the same fn?
Note this buffering happens both for list-watch and for streaming lists.
If we can move the allocation into the store and bubble up the event earlier, then we avoid double/triple allocating this, and users who write custom stores can avoid waiting or double allocating.
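A minimal, self-contained sketch of the double buffering described above; the type and function names are simplified stand-ins for kube-runtime internals, not its actual code:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real kube-runtime types.
#[derive(Clone)]
struct Obj {
    name: String,
}

enum Event {
    // The watcher first buffers the entire initial list into one Vec ...
    Restarted(Vec<Obj>),
}

#[derive(Default)]
struct Store {
    objects: HashMap<String, Obj>,
}

impl Store {
    // ... and the store then clones every object out of that Vec into its
    // own map, so the whole list briefly exists (at least) twice in memory.
    fn apply_watcher_event(&mut self, event: &Event) {
        match event {
            Event::Restarted(objs) => {
                self.objects = objs
                    .iter()
                    .map(|o| (o.name.clone(), o.clone()))
                    .collect();
            }
        }
    }
}
```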
Describe the solution you'd like
We can lift this caching with a new watcher::Event variant carrying a Page<Vec<K>> or Partial<Vec<K>> payload that can be bubbled up and inserted into the store. Because we now have a ready guard in the store, it should be safe to start inserting into the store immediately (though the guard would have to be altered slightly to fire only after a complete initial list/stream has happened).
This is a small breaking change to the enum, but it is contained to very internal interfaces and can be documented.
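A rough sketch of what that could look like, assuming a hypothetical Page variant and a heavily simplified store; the real variant name, payload type, and ready-guard wiring are all still open:

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct Obj {
    name: String,
}

enum Event {
    Applied(Obj),
    Deleted(Obj),
    Restarted(Vec<Obj>),
    // Hypothetical new variant: one page of the initial list/stream,
    // bubbled up as soon as it arrives rather than buffered until the
    // whole list is complete.
    Page(Vec<Obj>),
}

#[derive(Default)]
struct Store {
    objects: HashMap<String, Obj>,
    ready: bool, // stand-in for the ready guard
}

impl Store {
    fn apply_watcher_event(&mut self, event: Event) {
        match event {
            // Pages are moved straight into the store: no intermediate
            // buffer and no clone, since the event is consumed by value.
            Event::Page(page) => {
                for obj in page {
                    self.objects.insert(obj.name.clone(), obj);
                }
            }
            // In the real design the ready guard would only fire once the
            // initial list/stream has fully completed.
            Event::Restarted(objs) => {
                self.objects = objs.into_iter().map(|o| (o.name.clone(), o)).collect();
                self.ready = true;
            }
            Event::Applied(obj) => {
                self.objects.insert(obj.name.clone(), obj);
            }
            Event::Deleted(obj) => {
                self.objects.remove(&obj.name);
            }
        }
    }
}
```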
Describe alternatives you've considered
- flags in watcher::Config to decide whether to bubble up early; this doesn't avoid the breaking change of introducing a new watcher event for partial data (even if we don't act on it), because the enum is not #[non_exhaustive] (see the sketch after this list)
- a feature flag in kube-runtime to decide whether watcher::Event has extra variants; this feels pretty hairy for an already complex watcher trampoline, and we eventually want the best performance to be the default rather than hidden behind an opt-in
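To make the #[non_exhaustive] point concrete, a small illustration; the variant names are the illustrative ones used above, not necessarily the exact current kube API:

```rust
// Because watcher::Event is a plain public enum (not #[non_exhaustive]),
// adding any variant is a semver-breaking change: downstream exhaustive
// matches stop compiling until they handle it.
pub enum Event<K> {
    Applied(K),
    Deleted(K),
    Restarted(Vec<K>),
    // Adding this breaks every `match` that listed only the variants above
    // and had no `_` arm.
    Page(Vec<K>),
}

// Had the enum been marked #[non_exhaustive], downstream crates would
// already be forced to write a wildcard arm, so new variants could be
// added without breaking them:
#[non_exhaustive]
pub enum EventNonExhaustive<K> {
    Applied(K),
    Deleted(K),
    Restarted(Vec<K>),
}
```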
Documentation, Adoption, Migration Strategy
Highlight the change in a release; users who match on the low-level watcher::Event will get a new variant to match on (see the sketch below).
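For users who consume the low-level events directly, the migration is roughly one new match arm (reusing the illustrative Event<K> enum from the sketch above; the Page variant is still hypothetical):

```rust
// Hypothetical downstream handler after the new variant lands: existing
// arms stay the same, and one arm (or a `_` fallback) is added for pages.
fn handle<K>(event: Event<K>, cache: &mut Vec<K>) {
    match event {
        Event::Applied(obj) => cache.push(obj),
        Event::Deleted(_obj) => { /* remove the object from the cache */ }
        Event::Restarted(objs) => *cache = objs,
        // New arm required by the breaking change:
        Event::Page(mut page) => cache.append(&mut page),
    }
}
```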
Target crate for feature
kube-runtime
It's also been pointed out that this peak allocation might never be returned to the OS with the default allocator. Important bits from a Discord thread:
Another problem is that the default system allocator never returns the memory to the OS after the burst, even if the objects are dropped. Since the initial list fetch happens sporadically, you get higher RSS usage together with the memory spike. Solving the burst will solve this problem, and reflectors and watchers can be started in parallel without worrying about OOM killers.
The allocator does not return the memory to the OS since it treats it as a cache. This is mitigated by using jemalloc with some tuning; however, you still get the memory burst, so our solution was to use jemalloc + start the watchers sequentially. As you can imagine, it's not ideal.
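The jemalloc mitigation mentioned above usually looks roughly like this in a Rust binary; the tikv-jemallocator crate and MALLOC_CONF knobs are the common approach, but treat the exact tuning values as placeholders:

```rust
// Cargo.toml (sketch):
// [dependencies]
// tikv-jemallocator = "0.5"

use tikv_jemallocator::Jemalloc;

// Swap out the system allocator so freed pages can be returned to the OS
// more eagerly than the default allocator's caching behaviour allows.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    // Optional tuning is usually passed via the environment, e.g.
    //   MALLOC_CONF="dirty_decay_ms:1000,muzzy_decay_ms:1000"
    // (the exact variable name depends on how the jemalloc sys crate is built).
    // ... start watchers/reflectors here, sequentially if the burst is still a concern.
}
```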