-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to using Finalizers for ClusterIngress deletion. #3035
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mattmoor The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
if err := c.ensureFinalizer(r); err != nil { | ||
return err | ||
} | ||
|
||
logger.Info("Creating ClusterIngress.") | ||
clusterIngress, err := c.reconcileClusterIngress(ctx, r, resources.MakeClusterIngress(r, traffic)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should drop owner references for existing cluster ingresses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that they are particularly harmful as-is. Do you have some reason to think they are?
I actually (after a whole bunch of debugging!) just found a fun and subtle bug (which is why all these e2e tests are exploding), which ironically is caused by my dropping OwnerReferences (our filter on the Owner's Kind).
cc @tcnghia I think this is the reason for the bizarrely long resync earlier, we filtered out all of the ClusterIngress events! Oops. I'll push a fix for this shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have some reason to think they are?
Currently OwnerReferences
being used to across scopes (namespace,cluster) was undefined causing the K8s GC unable to delete the ClusterIngress
. Breaking foreground cascade deletion and causing deleted objects to be reconciled continuously.
If the undefined behaviour were to change would it have a material impact on us? Looking at the PR further I don't think so (at this time).
3e864c8
to
322acff
Compare
322acff
to
df8de84
Compare
The following is the coverage report on pkg/.
|
/test pull-knative-serving-unit-tests |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold
@tcnghia PTAL in the morning and |
This removes the OwnerReferences from ClusterIngress, which are buggy and unsupported anyways. Instead, we will add a finalizer prior to creating ClusterIngress, which will enable us to hook into Route deletion to clean up the ClusterIngress resources manually. When we see a Route with a DeletionTimestamp, we elide most processing. When we are the first finalizer on the list, it is our turn and we delete the cluster ingresses matching our label selector and remove the finalizer to allow deletion to proceed. Fixes: knative#2570
df8de84
to
ee481ba
Compare
/hold cancel Since the LGTM is removed. |
/lgtm |
I realize we can’t drop owner references in 0.4 since that breaks things when downgrading to 0.3. A ClusterIngress created with 0.4 without an owner will leak during a deletion when running an 0.3 controller. Unless we cherry pick parts of this change to a 0.3 point release and require that version prior to upgrading? |
This removes the OwnerReferences from ClusterIngress, which are buggy and unsupported anyways. Instead, we will add a finalizer prior to creating ClusterIngress, which will enable us to hook into Route deletion to clean up the ClusterIngress resources manually. When we see a Route with a DeletionTimestamp, we elide most processing. When we are the first finalizer on the list, it is our turn and we delete the cluster ingresses matching our label selector and remove the finalizer to allow deletion to proceed. Fixes: knative#2570
This removes the OwnerReferences from ClusterIngress, which are buggy and unsupported anyways.
Instead, we will add a finalizer prior to creating ClusterIngress, which will enable us to hook into Route deletion to clean up the ClusterIngress resources manually.
When we see a Route with a DeletionTimestamp, we elide most processing. When we are the first finalizer on the list, it is our turn and we delete the cluster ingresses matching our label selector and remove the finalizer to allow deletion to proceed.
Fixes: #2570