Don't recreate resources as we're tearing a thing down. #2678
Comments
Your explanation makes the most sense to me. Is this reproducible all the time? If so, it should be easy to add some instrumentation to confirm. Happy to take a stab at doing this. (serving/pkg/reconciler/v1alpha1/revision/reconcile_resources.go, lines 43 to 56 at commit 678373d)
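A minimal sketch of the kind of instrumentation being suggested, assuming the referenced block in reconcile_resources.go creates the Deployment when the lister reports it missing; the logger, lister, and names below are illustrative placeholders, not the actual reconciler's fields:

```go
// Illustrative helper for the instrumentation suggested above: before the
// reconciler (re)creates a missing Deployment, note whether its owner already
// has a DeletionTimestamp, i.e. whether a create here would race garbage
// collection. The package name, logger, lister, and parameters are placeholders.
package revision

import (
	"go.uber.org/zap"
	appsv1 "k8s.io/api/apps/v1"
	apierrs "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	appsv1listers "k8s.io/client-go/listers/apps/v1"
)

// getDeploymentWithTeardownCheck looks up the Deployment and warns if it is
// missing while the owning object is mid-delete.
func getDeploymentWithTeardownCheck(logger *zap.SugaredLogger, lister appsv1listers.DeploymentLister,
	owner metav1.Object, ns, name string) (*appsv1.Deployment, error) {
	d, err := lister.Deployments(ns).Get(name)
	if apierrs.IsNotFound(err) && owner.GetDeletionTimestamp() != nil {
		// Suspected bug path: the owner is being deleted, its Deployment has
		// already been reaped, and the reconciler is about to bring it back.
		logger.Warnf("Deployment %s/%s is missing while owner %s is being deleted (DeletionTimestamp=%v); "+
			"creating it now would race garbage collection", ns, name, owner.GetName(), owner.GetDeletionTimestamp())
	}
	return d, err
}
```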
/milestone Serving 0.4
/assign @dgerd
From @mattmoor: "I think the key is that we didn't make
I added some logging and attempted to reproduce this at HEAD by creating a runLatest Service and then deleting it. I tried multiple times, deleting immediately after creation, after a few seconds, and after a few minutes, and I have yet to reproduce this. I am using kubectl apply -f to create the Service and kubectl delete -f to delete it. I will try throwing traffic at it before deleting to see if that changes the behavior at all. Let me know if you have any other reproduction advice.
So I can see this in the scale test (at least cranked up to 150 as I have it right now), e.g.
Interpreting from the suffixes, these pods came from distinct ReplicaSets, and the main way that would happen would be if we created a second Deployment while the pods from the first were still tearing down.
I added some logic in the revision controller to log when we see Revisions with a
I'd initially thought this was us doing something obviously wrong, but given that
When we had finalizers previously we would race with K8s GC to recreate our children as K8s reaped them. The simplest way to test this is to enable "foreground" deletion in our e2e tests, which is implemented as a finalizer. Fixes: knative#2678
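For reference, a minimal sketch of what requesting foreground deletion looks like from Go, here using a recent client-go's dynamic client against the v1alpha1 Service resource; the kubeconfig handling, namespace, and Service name are placeholders, and the actual e2e test helpers may wire this differently:

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load a kubeconfig from the default location (illustrative only).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	svcGVR := schema.GroupVersionResource{
		Group:    "serving.knative.dev",
		Version:  "v1alpha1",
		Resource: "services",
	}

	// Foreground propagation makes the API server put the foregroundDeletion
	// finalizer on the object, so it lingers with a DeletionTimestamp until its
	// dependents are gone; that is exactly the window in which the reconciler
	// could race GC by recreating children.
	fg := metav1.DeletePropagationForeground
	err = dyn.Resource(svcGVR).Namespace("default").Delete(context.TODO(), "my-service",
		metav1.DeleteOptions{PropagationPolicy: &fg})
	if err != nil {
		log.Fatal(err)
	}
}
```

Deleting the owning object with foreground propagation, rather than deleting children directly, is what keeps it around with a DeletionTimestamp while its children are reaped, which is the window the test needs to exercise.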
We recently started registering DeleteFunc handlers for our resources, which enables us to quickly recreate resources if they are deleted out from under us.

A byproduct of this (still needs confirmation) is that I noticed that sometimes, when a Revision is being torn down, right at the moment the Deployment's pod starts Terminating, another Pod (under a different ReplicaSet) appears. At first I thought this was us updating the Deployment (e.g. #2632), but we are clearly creating two Deployments in our logs.

I think what's happening is that we Reconcile the Revision when it has a DeletionTimestamp (mid delete) and see no Deployment, so we recreate it.

cc @lichuqiang @dgerd
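To make that hypothesis concrete, here is a minimal sketch of the guard it implies: skip child-resource reconciliation entirely once the Revision carries a DeletionTimestamp. The types and method names are simplified stand-ins, not the actual knative/serving reconciler code:

```go
// Package revision here is only a name for this sketch; the real reconciler
// lives in serving/pkg/reconciler/v1alpha1/revision.
package revision

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Revision is a simplified stand-in for the real v1alpha1.Revision type; all
// we need from it here is the embedded ObjectMeta.
type Revision struct {
	metav1.ObjectMeta
}

// Reconciler is a simplified stand-in for the revision reconciler.
type Reconciler struct{}

// reconcileDeployment is a placeholder for the real "look up the Deployment
// and create it if missing" logic referenced earlier in the thread.
func (c *Reconciler) reconcileDeployment(rev *Revision) error {
	return nil
}

// reconcile shows the shape of the guard: if the Revision already has a
// DeletionTimestamp (for example because a finalizer or foreground deletion is
// holding it), do not touch child resources; let garbage collection finish.
func (c *Reconciler) reconcile(rev *Revision) error {
	if rev.GetDeletionTimestamp() != nil {
		return nil
	}
	return c.reconcileDeployment(rev)
}
```

With a guard like this, a Revision that is mid-delete no longer races garbage collection by bringing back the Deployment that GC just reaped.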