🐛 Use separate cache for partial metadata watches on secrets to include all secrets #10633
Conversation
Resolved review threads (now outdated): exp/addons/internal/controllers/clusterresourceset_controller.go (4 threads), exp/addons/internal/controllers/predicates/resource_predicates.go (1 thread)
Very nice!
Force-pushed from 9fde0e2 to 4f90185.
@chrischdi can you please check the unit tests?
/test pull-cluster-api-e2e-main
Just a few nits. Sorry for the nitpicking; I'm just playing around a bit with generics and trying to find the simplest implementation.
Otherwise all good. I also tested it and it works perfectly (inspected the caches at runtime).
/test pull-cluster-api-e2e-main
// secretToExtensionConfigFunc returns a func which maps a secret to ExtensionConfigs with the corresponding
// InjectCAFromSecretAnnotation to reconcile them on updates of the secrets.
func (r *Reconciler) secretToExtensionConfigFunc(ctx context.Context, o *metav1.PartialObjectMetadata) []reconcile.Request {
Can we revert this (func name + godoc) entirely to what is on main? I think the godoc is not correct anymore (+ the func name is a bit inconsistent now with how we usually call these funcs)
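For readers outside the codebase, a mapping func of this shape typically lists all ExtensionConfigs and enqueues the ones whose inject-CA-from-secret annotation points at the changed Secret. A minimal sketch, assuming cluster-api's runtimev1 package layout and annotation constant (the function name and signature are hypothetical, not the PR's exact implementation):

```go
package controllers

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	runtimev1 "sigs.k8s.io/cluster-api/exp/runtime/api/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// secretToExtensionConfigs is a hypothetical mapping func: for a changed Secret
// (seen as metadata only), enqueue every ExtensionConfig whose
// inject-CA-from-secret annotation references that Secret.
func secretToExtensionConfigs(ctx context.Context, c client.Client, secret *metav1.PartialObjectMetadata) []reconcile.Request {
	extensionConfigs := runtimev1.ExtensionConfigList{}
	if err := c.List(ctx, &extensionConfigs); err != nil {
		return nil
	}

	requests := []reconcile.Request{}
	for i := range extensionConfigs.Items {
		ext := extensionConfigs.Items[i]
		// The annotation value is expected in "namespace/name" form.
		if ext.GetAnnotations()[runtimev1.InjectCAFromSecretAnnotation] == secret.Namespace+"/"+secret.Name {
			requests = append(requests, reconcile.Request{NamespacedName: client.ObjectKeyFromObject(&ext)})
		}
	}
	return requests
}
```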
Last nit from my side. /assign @fabriziopandini
/test pull-cluster-api-e2e-main
Thank you very much! Let's get some additional reviews if possible, just in case I'm missing something.
LGTM label has been added. Git tree hash: fd0b933f00763538f0332835600823f4a8a7933d
Nice change!
// This way the watch does not use the LabelSelector defined at the cache which
// would filter to secrets having the cluster label, because secrets referred
// by ClusterResourceSet or ExtensionConfig are not specific to a single cluster.
partialSecretCache, err := cache.New(mgr.GetConfig(), cache.Options{
q: should this be allSecretCache instead of partialSecretCache? (nothing in the definition points to partial)
q: is there a way to make sure this cache is used only for Secrets? (I think not, but maybe we can enforce this with a DefaultTransformerFunc that always returns an error)
q: should we use TransformStripManagedFields for secrets? (not necessary, but it doesn't hurt)
cc @sbueringer
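For orientation, a minimal sketch of the pattern being discussed, assuming controller-runtime's cache.Options and manager APIs; the helper name, option values, and error handling are illustrative, not the merged code:

```go
package main

import (
	"fmt"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

// setupSecretCache is a hypothetical helper: it builds a second cache that does NOT
// inherit the manager-wide label selector, so metadata-only watches on Secrets can
// see all Secrets, and registers it with the manager so it is started alongside it.
func setupSecretCache(mgr ctrl.Manager) (cache.Cache, error) {
	secretCache, err := cache.New(mgr.GetConfig(), cache.Options{
		HTTPClient: mgr.GetHTTPClient(),
		Scheme:     mgr.GetScheme(),
		Mapper:     mgr.GetRESTMapper(),
		// Drop managedFields to keep the cached objects small
		// (the TransformStripManagedFields question above).
		DefaultTransform: cache.TransformStripManagedFields(),
	})
	if err != nil {
		return nil, fmt.Errorf("failed to create cache for metadata-only Secret watches: %w", err)
	}
	// The cache is a Runnable; adding it to the manager starts and stops it with the manager.
	if err := mgr.Add(secretCache); err != nil {
		return nil, fmt.Errorf("failed to add Secret cache to manager: %w", err)
	}
	return secretCache, nil
}
```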
The intention behind naming it partialSecretCache is that we tend to only use it for PartialObjectMetadata watches/objects. Maybe I should add that information to the comment?
I like the idea of adding a DefaultTransformerFunc and implemented it.
This way we can make sure the cache is not misused 👍
What happens if we hit the error cases now? (Does the controller fail? Do we get NotFound on Get? Anything else?)
Is the behavior good enough to make sure we never use the cache for the wrong purpose, or will it just mean that some of our code doesn't work?
I'm wondering if it would be better to just panic if someone tries to use this cache for the wrong GVK.
It's currently only logging and retrying:
E0604 05:56:24.840511 17 reflector.go:150] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:232: Failed to watch *v1.PartialObjectMetadata: unable to sync list result: couldn't enqueue object: cache expected to only get Secrets, got &TypeMeta{Kind:ConfigMap,APIVersion:v1,}
I'll adjust to do a panic instead.
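A minimal sketch of such a guarding transform, assuming it panics on non-Secret objects as discussed; the function name and exact messages are hypothetical:

```go
package main

import (
	"fmt"

	toolscache "k8s.io/client-go/tools/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// secretsOnlyTransform is a hypothetical cache transform guarding against misuse:
// anything that is not a Secret makes it panic, so using the cache for the wrong
// GVK fails loudly instead of being silently retried by the reflector.
func secretsOnlyTransform() toolscache.TransformFunc {
	return func(in interface{}) (interface{}, error) {
		obj, ok := in.(client.Object)
		if !ok {
			panic(fmt.Sprintf("cache expected client.Object, got %T", in))
		}
		if gvk := obj.GetObjectKind().GroupVersionKind(); gvk.Kind != "Secret" || gvk.Group != "" {
			panic(fmt.Sprintf("cache expected to only get Secrets, got %s", gvk.Kind))
		}
		// Strip managedFields as a bonus, mirroring TransformStripManagedFields.
		obj.SetManagedFields(nil)
		return obj, nil
	}
}
```

Plugged in as the cache's DefaultTransform, a guard like this produces a panic similar to the trace below whenever a non-Secret object is listed or watched through the cache.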
Now it observes the panic, prints the trace, and gets stuck (which I think is better than just logging the error):
E0604 05:58:32.951254 62 runtime.go:79] Observed a panic: &errors.errorString{s:"cache expected to only get Secrets, got &TypeMeta{Kind:ConfigMap,APIVersion:v1,}"} (cache expected to only get Secrets, got &TypeMeta{Kind:ConfigMap,APIVersion:v1,})
goroutine 313 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x271ef40, 0x40003a6bf0})
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0xdc
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x0})
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0xb4
panic({0x271ef40?, 0x40003a6bf0?})
/Users/schlotterc/.bin/go-archive/go1.22.1.darwin-arm64/src/runtime/panic.go:770 +0xf0
main.setupReconcilers.func1({0x2aaede0, 0x40008f37a0})
/Users/schlotterc/go/src/sigs.k8s.io/cluster-api/main.go:455 +0x2f0
k8s.io/client-go/tools/cache.(*DeltaFIFO).queueActionLocked(0x4000362f20, {0x2aeb978, 0x8}, {0x2aaede0, 0x40008f37a0})
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/delta_fifo.go:456 +0x15c
k8s.io/client-go/tools/cache.(*DeltaFIFO).Replace(0x4000362f20, {0x4000292280, 0x13, 0x13}, {0x40007e99b8, 0x4})
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/delta_fifo.go:641 +0x390
k8s.io/client-go/tools/cache.(*Reflector).syncWith(0x4000988a80, {0x4000292140, 0x13, 0x13}, {0x40007e99b8, 0x4})
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:706 +0x1d0
k8s.io/client-go/tools/cache.(*Reflector).list(0x4000988a80, 0x4000a1ae40)
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:577 +0xe68
k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0x4000988a80, 0x4000a1ae40)
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:353 +0x344
k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:298 +0x30
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x40009ede38)
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x48
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x4000a59e38, {0x2d710c0, 0x4000290b90}, 0x1, 0x4000a1ae40)
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xa0
k8s.io/client-go/tools/cache.(*Reflector).Run(0x4000988a80, 0x4000a1ae40)
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:297 +0x24c
k8s.io/apimachinery/pkg/util/wait.(*Group).StartWithChannel.func1()
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:55 +0x34
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:72 +0xa8
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start in goroutine 305
/Users/schlotterc/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:70 +0xc4
2024-06-04T05:58:32Z error layer=rpc writing response:write tcp [::1]:30000->[::1]:46760: use of closed network connection
Thx!
Force-pushed from 53006e4 to 1dd1d9e.
/test pull-cluster-api-e2e-main
Cosmetics: /override pull-cluster-api-apidiff-main
@chrischdi: chrischdi unauthorized: /override is restricted to Repo administrators.
@chrischdi: The following test failed.
Thank you! /lgtm /assign @fabriziopandini
LGTM label has been added. Git tree hash: bcdf8982b46ba4accb1c3f9268684e250b1639af
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: vincepri.
What this PR does / why we need it:
This PR introduces a separate cache which is used in the clusterresourceset_controller for watching secrets.
Previously the WatchesMetadata for secrets in clusterresourceset_controller inherited the LabelSelector configured in main.go: https://github.com/kubernetes-sigs/cluster-api/blob/main/main.go#L322-L329
This label selector gets passed through in controller-runtime to the informer which gets created for the watch.
Secrets for ClusterResourceSets may apply to multiple clusters, so the label selector may not even exist on the secrets referred to by ClusterResourceSets.
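To illustrate the inherited selector, here is a rough approximation of a manager cache configured with a per-object label selector, in the spirit of the main.go setup linked above (the label key, selector, and structure are assumptions, not an exact copy):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/selection"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// newManager sketches a manager whose cache only holds Secrets carrying the
// cluster-name label. Any watch built on this cache, including WatchesMetadata,
// inherits that selector, so Secrets without the label are invisible to it.
func newManager() (ctrl.Manager, error) {
	hasClusterName, err := labels.NewRequirement(clusterv1.ClusterNameLabel, selection.Exists, nil)
	if err != nil {
		return nil, err
	}
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Secret{}: {Label: labels.NewSelector().Add(*hasClusterName)},
			},
		},
	})
}
```

The separate cache introduced by this PR avoids that by not applying such a selector to the metadata-only Secret watches.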
Which issue(s) this PR fixes: Fixes #10557
/area clusterresourceset