deploy: set ownerRef from RC to grouped API version #14582

mfojtik · 2017-06-12T09:50:48Z

@smarterclayton @sttts this should fix the deployer gone missing problem (I believe).

Some explanation:

When the GC cache is cold, the GC tries a live GET with the dynamic client to check whether the resource specified in the ownerRef exists. In this case the RC points to a DC. However, the dynamic client uses the wrong API prefix (/api instead of /oapi), so the request returns 404; the GC then thinks the DC is gone and marks the RC for deletion. The RC is deleted, and since the deployer pod has an ownerRef pointing to this RC, the pod is also terminated and deleted.
When the GC caches warm up, the GC sees the DC (UUID), so it skips the live GET lookup and leaves the RC untouched, allowing a successful rollout. I guess we were just lucky in our CI/CD and warmed the cache fast enough to not hit this (the ipfailover flaked)...
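
To illustrate the prefix problem, here is a minimal sketch of how a generic client could derive a request path from nothing but the apiVersion string in an ownerRef. The `resourcePath` helper is hypothetical and is not the dynamic client's actual code; it only assumes the standard convention that an ungrouped version maps to `/api/<version>` and a grouped one to `/apis/<group>/<version>`.

```go
package main

import (
	"fmt"
	"strings"
)

// resourcePath (hypothetical helper) builds the REST path for a namespaced
// object using only the apiVersion taken from an ownerRef.
// An apiVersion without "/" has no group and maps to the "/api" prefix;
// a "group/version" apiVersion maps to "/apis/<group>/<version>".
func resourcePath(apiVersion, namespace, resource, name string) string {
	prefix := "/api/" + apiVersion
	if strings.Contains(apiVersion, "/") {
		prefix = "/apis/" + apiVersion
	}
	return fmt.Sprintf("%s/namespaces/%s/%s/%s", prefix, namespace, resource, name)
}

func main() {
	// Ungrouped reference: the lookup lands under /api, which is the
	// request described above as returning 404 for a DC.
	fmt.Println(resourcePath("v1", "default", "deploymentconfigs", "router"))
	// Grouped reference: the lookup lands under /apis/apps.openshift.io/v1.
	fmt.Println(resourcePath("apps.openshift.io/v1", "default", "deploymentconfigs", "router"))
}
```

With an ungrouped `v1` reference there is no way for such a client to know it should use `/oapi`, which is why the lookup misses.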

This patch forces the grouped API version for the DC in the ownerRef, which the dynamic client should handle (I can confirm it returns 200).
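
As a rough sketch of the idea (not the code in this PR), building the ownerRef could look like the following. `OwnerReference` here is a local stand-in for the Kubernetes `metav1.OwnerReference` type, `newDCOwnerRef` is a hypothetical helper, and the `apps.openshift.io/v1` value is taken from the request paths the GC client logs for DCs.

```go
package main

import "fmt"

// OwnerReference is a local stand-in for metav1.OwnerReference,
// reduced to the fields that matter for this example.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
	Controller *bool
}

// groupedDCVersion is the grouped API version assumed for DeploymentConfigs.
const groupedDCVersion = "apps.openshift.io/v1"

// newDCOwnerRef (hypothetical helper) returns a controller ownerRef for a
// DeploymentConfig that always carries the grouped API version, never bare "v1".
func newDCOwnerRef(name, uid string) OwnerReference {
	isController := true
	return OwnerReference{
		APIVersion: groupedDCVersion,
		Kind:       "DeploymentConfig",
		Name:       name,
		UID:        uid,
		Controller: &isController,
	}
}

func main() {
	// "example-uid" is a placeholder; a real ref would carry the DC's UID.
	ref := newDCOwnerRef("docker-registry", "example-uid")
	fmt.Printf("%s %s/%s controller=%t\n", ref.APIVersion, ref.Kind, ref.Name, *ref.Controller)
}
```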

Fixes: #13995

mfojtik · 2017-06-12T09:51:32Z

@tnozicka @smarterclayton do we need a migration for RCs that already have the legacy API group set, or will the refManager patch them again with the correct ref?

mfojtik · 2017-06-12T09:51:46Z

[test]

mfojtik · 2017-06-12T09:57:18Z

To confirm this works:

I0612 11:56:06.641729    4880 wrap.go:42] GET /apis/apps.openshift.io/v1/namespaces/default/deploymentconfigs/docker-registry: (2.215821ms) 200 [[openshift/v1.6.1+5115d708d7 (linux/amd64) kubernetes/010d313/system:serviceaccount:kube-system:generic-garbage-collector] 192.168.64.3:47245]
I0612 11:56:06.641792    4880 wrap.go:42] GET /apis/apps.openshift.io/v1/namespaces/default/deploymentconfigs/router: (2.318231ms) 200 [[openshift/v1.6.1+5115d708d7 (linux/amd64) kubernetes/010d313/system:serviceaccount:kube-system:generic-garbage-collector] 192.168.64.3:47245]

Now the GC client gets the DC successfully.

mfojtik · 2017-06-12T10:06:40Z

@bparees fyi, i guess the dynamic client will be broken for all ownerRefs that build* creates?

mfojtik · 2017-06-12T11:12:49Z

@jwforres @spadgett dunno if this affects web console or not.

tnozicka

LGTM

openshift-bot · 2017-06-12T11:54:02Z

Evaluated for origin test up to 0824d22

deads2k · 2017-06-12T11:55:33Z

lgtm [merge]

deads2k · 2017-06-12T11:56:49Z

@mfojtik let's try to think of a way to make sure that these cause errors, not disasters. Seems like using a RESTMapper without our /oapi types could be a winner.
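
One way to picture "errors, not disasters" is a lookup that fails closed. This is only a sketch of the principle, not the RESTMapper API: `knownResources` and `resolveOwner` are hypothetical, and the registry deliberately omits the ungrouped DeploymentConfig entry so that such a reference produces an error instead of a lookup that is mistaken for "owner is gone".

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownKind signals that the owner's kind cannot be resolved to an endpoint.
var errUnknownKind = errors.New("owner kind not registered")

// knownResources is a hypothetical registry keyed by "apiVersion/Kind".
// The legacy ungrouped DeploymentConfig entry is intentionally absent.
var knownResources = map[string]string{
	"v1/ReplicationController":              "replicationcontrollers",
	"apps.openshift.io/v1/DeploymentConfig": "deploymentconfigs",
}

// resolveOwner returns the resource name for an owner, or an error when the
// kind is unknown, so the caller can skip deletion rather than assume absence.
func resolveOwner(apiVersion, kind string) (string, error) {
	resource, ok := knownResources[apiVersion+"/"+kind]
	if !ok {
		return "", fmt.Errorf("%s %s: %w", apiVersion, kind, errUnknownKind)
	}
	return resource, nil
}

func main() {
	if _, err := resolveOwner("v1", "DeploymentConfig"); err != nil {
		// An unresolvable owner is reported, and nothing is deleted.
		fmt.Println("refusing to garbage collect:", err)
	}
}
```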

mfojtik · 2017-06-12T11:59:53Z

@smarterclayton I think this was 85% of Friday's problem; the remaining 15% was the quorum read, which we should address as well.

openshift-bot · 2017-06-12T12:05:34Z

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 5

openshift-bot · 2017-06-12T12:05:34Z

Evaluated for origin merge up to 0824d22

pweil- · 2017-06-12T12:19:19Z

@jupierce FYI

tnozicka · 2017-06-12T12:20:43Z

I don't think we need a migration for this, as it was not in any released version yet. (It should also more or less migrate itself over time: as the old RCs get deleted by mistake, new RCs get created properly, though with loss of history. This should happen only on startup [because of caches], causing almost no downtime since the containers had not really started yet anyway.)

openshift-bot · 2017-06-12T13:39:54Z

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2129/) (Base Commit: c09c601)

smarterclayton · 2017-06-12T14:16:42Z

Don't we have to do this for builds and everything else?

mfojtik · 2017-06-12T14:17:14Z

@smarterclayton @bparees was already warned

smarterclayton · 2017-06-12T19:09:15Z

Force merging to clear the queue, does not conflict with currently merging PR.

mfojtik requested a review from tnozicka June 12, 2017 09:57

mfojtik added the priority/P0 label Jun 12, 2017

deploy: set ownerRef from RC to grouped API version

0824d22

mfojtik force-pushed the fix-dc-ref branch from 51ca78f to 0824d22 Compare June 12, 2017 10:07

tnozicka approved these changes Jun 12, 2017

tnozicka mentioned this pull request Jun 12, 2017

Add tests for RC->DC controllerRef #14416

Closed

mfojtik mentioned this pull request Jun 12, 2017

Garbage collector deleted deployer pod prematurely #13995

Closed

smarterclayton merged commit 496909d into openshift:master Jun 12, 2017

tnozicka mentioned this pull request Jun 14, 2017

Garbage collector will delete objects when wrong GVK is specified in ownerRefs #14646

Closed

mfojtik mentioned this pull request Jul 12, 2017

github.com/openshift/origin/test/end-to-end/core.test/end-to-end/core.sh:112: executing 'oc rollout status dc/docker-registry' expecting success #13943

Closed

mfojtik deleted the fix-dc-ref branch September 5, 2018 21:07
