What problem are you trying to solve?
This is a follow-up to watcher paging/streaming, in an effort to reduce allocations and move the complexity to where it needs to be. The main problem here is allocation for initial lists (or initial streaming lists). These are essentially allocated twice internally:
- watcher's step_trampolined
- Store's apply_watcher_event
- thirdly? it looks like we clone into another buffer before allocating it, due to not consuming with an into_iter in the same fn?
Note this buffering happens both for list-watch and for streaming lists.
If we can move the allocation into the store and bubble up the event earlier, then we avoid double/triple allocating this, and users who write custom stores can avoid waiting or double allocating.
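A minimal, self-contained sketch of the double buffering described above; the type and function names are simplified stand-ins for kube-runtime internals, not its actual code:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real kube-runtime types.
#[derive(Clone)]
struct Obj {
    name: String,
}

enum Event {
    // The watcher first buffers the entire initial list into one Vec ...
    Restarted(Vec<Obj>),
}

#[derive(Default)]
struct Store {
    objects: HashMap<String, Obj>,
}

impl Store {
    // ... and the store then clones every object out of that Vec into its
    // own map, so the whole list briefly exists (at least) twice in memory.
    fn apply_watcher_event(&mut self, event: &Event) {
        match event {
            Event::Restarted(objs) => {
                self.objects = objs
                    .iter()
                    .map(|o| (o.name.clone(), o.clone()))
                    .collect();
            }
        }
    }
}
```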
Describe the solution you'd like
We can lift this caching with a new watcher::Event variant carrying a Page<Vec<K>> or Partial<Vec<K>> payload that can be bubbled up and inserted into the store. Because we now have a ready guard in the store, it should be safe to start inserting into the store immediately (though the guard would have to be altered slightly to fire only after a complete initial list/stream has happened).
This is a small breaking change to the enum, but it is contained to very internal interfaces and can be documented.
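A rough sketch of what that could look like, assuming a hypothetical Page variant and a heavily simplified store; the real variant name, payload type, and ready-guard wiring are all still open:

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct Obj {
    name: String,
}

enum Event {
    Applied(Obj),
    Deleted(Obj),
    Restarted(Vec<Obj>),
    // Hypothetical new variant: one page of the initial list/stream,
    // bubbled up as soon as it arrives rather than buffered until the
    // whole list is complete.
    Page(Vec<Obj>),
}

#[derive(Default)]
struct Store {
    objects: HashMap<String, Obj>,
    ready: bool, // stand-in for the ready guard
}

impl Store {
    fn apply_watcher_event(&mut self, event: Event) {
        match event {
            // Pages are moved straight into the store: no intermediate
            // buffer and no clone, since the event is consumed by value.
            Event::Page(page) => {
                for obj in page {
                    self.objects.insert(obj.name.clone(), obj);
                }
            }
            // In the real design the ready guard would only fire once the
            // initial list/stream has fully completed.
            Event::Restarted(objs) => {
                self.objects = objs.into_iter().map(|o| (o.name.clone(), o)).collect();
                self.ready = true;
            }
            Event::Applied(obj) => {
                self.objects.insert(obj.name.clone(), obj);
            }
            Event::Deleted(obj) => {
                self.objects.remove(&obj.name);
            }
        }
    }
}
```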
Describe alternatives you've considered
- flags in watcher::Config to decide whether to bubble up early; this doesn't avoid the breaking change of introducing a new watcher event for partial data (even if we don't act on it), because the enum is not #[non_exhaustive] (see the sketch after this list)
- a feature flag in kube-runtime to decide whether watcher::Event has extra variants; this feels pretty hairy for an already complex watcher trampoline, and we eventually want the best performance to be the default rather than hidden behind an opt-in
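To make the #[non_exhaustive] point concrete, a small illustration; the variant names are the illustrative ones used above, not necessarily the exact current kube API:

```rust
// Because watcher::Event is a plain public enum (not #[non_exhaustive]),
// adding any variant is a semver-breaking change: downstream exhaustive
// matches stop compiling until they handle it.
pub enum Event<K> {
    Applied(K),
    Deleted(K),
    Restarted(Vec<K>),
    // Adding this breaks every `match` that listed only the variants above
    // and had no `_` arm.
    Page(Vec<K>),
}

// Had the enum been marked #[non_exhaustive], downstream crates would
// already be forced to write a wildcard arm, so new variants could be
// added without breaking them:
#[non_exhaustive]
pub enum EventNonExhaustive<K> {
    Applied(K),
    Deleted(K),
    Restarted(Vec<K>),
}
```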
Documentation, Adoption, Migration Strategy
Highlight the change in a release; users who match on the low-level watcher::Event will get a new variant to match on (see the sketch below).
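For users who consume the low-level events directly, the migration is roughly one new match arm (reusing the illustrative Event<K> enum from the sketch above; the Page variant is still hypothetical):

```rust
// Hypothetical downstream handler after the new variant lands: existing
// arms stay the same, and one arm (or a `_` fallback) is added for pages.
fn handle<K>(event: Event<K>, cache: &mut Vec<K>) {
    match event {
        Event::Applied(obj) => cache.push(obj),
        Event::Deleted(_obj) => { /* remove the object from the cache */ }
        Event::Restarted(objs) => *cache = objs,
        // New arm required by the breaking change:
        Event::Page(mut page) => cache.append(&mut page),
    }
}
```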
Target crate for feature
kube-runtime
It's also been pointed out that this peak allocation might never be returned to the OS with the default allocator. Important bits from a Discord thread:
Another problem is that the default system allocator never returns the memory to the OS after the burst, even if the objects are dropped. Since the initial list fetch happens sporadically, you get higher RSS usage together with the memory spike. Solving the burst will solve this problem, and reflectors and watchers can be started in parallel without worrying about OOM killers.
The allocator does not return the memory to the OS since it treats it as a cache. This is mitigated by using jemalloc with some tuning; however, you still get the memory burst, so our solution was to use jemalloc + start the watchers sequentially. As you can imagine, it's not ideal.
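The jemalloc mitigation mentioned above usually looks roughly like this in a Rust binary; the tikv-jemallocator crate and MALLOC_CONF knobs are the common approach, but treat the exact tuning values as placeholders:

```rust
// Cargo.toml (sketch):
// [dependencies]
// tikv-jemallocator = "0.5"

use tikv_jemallocator::Jemalloc;

// Swap out the system allocator so freed pages can be returned to the OS
// more eagerly than the default allocator's caching behaviour allows.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    // Optional tuning is usually passed via the environment, e.g.
    //   MALLOC_CONF="dirty_decay_ms:1000,muzzy_decay_ms:1000"
    // (the exact variable name depends on how the jemalloc sys crate is built).
    // ... start watchers/reflectors here, sequentially if the burst is still a concern.
}
```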