
Garbage collector will delete objects when wrong GVK is specified in ownerRefs #14646

Closed · tnozicka opened this issue Jun 14, 2017 · 9 comments
Labels: component/kubernetes, kind/bug, lifecycle/frozen, lifecycle/stale, priority/P2

tnozicka (Contributor) commented Jun 14, 2017

Follow-up to #14582.

The upstream garbage collector already checks for and ignores objects it doesn't know about.

What happens in OpenShift: if you specify an invalid resource, it works the same way. But if you specify a valid OpenShift resource with an apiVersion that omits the group, and the GC has a cold cache, it will delete the dependent as an orphan. That's because the RESTMapper won't raise an error, but the dynamic client hits a URL under /api instead of /oapi and gets a 404, as if the resource didn't exist, and thus deletes the "orphan". This causes data loss. Even if we fix all the controllers to create valid controllerRefs and test them properly, a user can still manually (or with some automation of their own) create an ownerReference, and such an object would be incorrectly deleted, resulting in data loss.
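To make the failure mode concrete, here is a minimal sketch (names and UID are illustrative, not from this issue) of an ownerReference of the problematic shape: the apiVersion names a version but no API group, so a GC with a cold cache resolves the owner under /api, gets a 404, and deletes the dependent as an orphan.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// badOwnerRef builds an ownerReference whose apiVersion carries no API
// group. The GC's dynamic client then looks the owner up under /api (the
// Kubernetes core group) instead of /oapi, gets a 404 as if the owner
// didn't exist, and deletes the dependent. Name and UID are illustrative.
func badOwnerRef(uid types.UID) metav1.OwnerReference {
	return metav1.OwnerReference{
		APIVersion: "v1",               // missing group: looked up under /api
		Kind:       "DeploymentConfig", // actually served under /oapi
		Name:       "my-dc",
		UID:        uid,
	}
}

func main() {
	fmt.Printf("%+v\n", badOwnerRef(types.UID("2f7c63cd-illustrative")))
}
```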

It seems like we could switch the RESTMapper used by the garbage collector to fix it. (See @deads2k's comment.)

deads2k (Contributor) commented Jun 14, 2017

> It seems like we could switch the RESTMapper used by the garbage collector to fix it. (See @deads2k's comment.)

To be clear, the fix in this case means to fail harder and earlier, not actually tolerate such owner references.
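For illustration, a minimal sketch of that direction, assuming the apimachinery RESTMapper interface (this is not the actual GC change): resolve the ownerReference's apiVersion/kind through the mapper up front and return an error when it cannot be resolved, rather than letting a 404 from the wrong URL be read as a missing owner.

```go
package gcsketch // hypothetical package name

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// resolveOwner fails hard and early: if the ownerReference's group/version/
// kind cannot be mapped to a known resource, surface an error instead of
// letting the GC treat the dependent as an orphan and delete it.
func resolveOwner(mapper meta.RESTMapper, ref metav1.OwnerReference) (*meta.RESTMapping, error) {
	gv, err := schema.ParseGroupVersion(ref.APIVersion)
	if err != nil {
		return nil, fmt.Errorf("ownerReference has malformed apiVersion %q: %v", ref.APIVersion, err)
	}
	mapping, err := mapper.RESTMapping(gv.WithKind(ref.Kind).GroupKind(), gv.Version)
	if err != nil {
		return nil, fmt.Errorf("cannot resolve ownerReference %s/%s %q: %v", ref.APIVersion, ref.Kind, ref.Name, err)
	}
	return mapping, nil
}
```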

tnozicka added the kind/bug label on Jun 15, 2017
tnozicka (Contributor, Author) commented:

Agreed. (To expand on that: it doesn't fail with an error at all at this point, which is the biggest issue; it just deletes data incorrectly.)

tnozicka (Contributor, Author) commented:

cc: @mfojtik

deads2k (Contributor) commented Jun 16, 2017

I don't think this is a blocker for 3.6. The ownerRefs are managed in code; we just have to set them correctly.

tnozicka (Contributor, Author) commented:

@deads2k For our objects, yes. But in the DC<-RC<-Pod chain we set only one controllerRef each, on the Pod and the RC. Users could also set other owner references manually to achieve cascade deletion, or have scripts/controllers of their own in place to do that.

ControllerRef is a special kind of ownerRef (and a singleton) that we control, so we can make sure it is set up correctly. OwnerRefs in general can be set by anyone, I think. Say you want to tie the lifetime of your DC to an IS, or to another DC, or ... a user can do that. I am aware that the probability of someone doing that is quite low, but it still results in data loss.
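For illustration, a minimal sketch (the helper name and the API group shown are assumptions, not from this issue) of the kind of plain, non-controller ownerReference a user could set by hand to tie one object's lifetime to another's:

```go
package ownerrefsketch // hypothetical package name

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// userOwnerRef is a plain ownerReference of the kind anyone can set by
// hand to get cascade deletion. Unlike a controllerRef, Controller is left
// unset, and nothing validates that APIVersion/Kind are spelled correctly,
// which is exactly what makes the data-loss scenario above reachable.
func userOwnerRef(owner metav1.Object, apiVersion, kind string) metav1.OwnerReference {
	return metav1.OwnerReference{
		APIVersion: apiVersion, // e.g. "apps.openshift.io/v1" (assumed)
		Kind:       kind,       // e.g. "DeploymentConfig"
		Name:       owner.GetName(),
		UID:        owner.GetUID(),
	}
}
```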

deads2k (Contributor) commented Jun 16, 2017

> ControllerRef is a special kind of ownerRef (and a singleton) that we control, so we can make sure it is set up correctly. OwnerRefs in general can be set by anyone, I think. Say you want to tie the lifetime of your DC to an IS, or to another DC, or ... a user can do that. I am aware that the probability of someone doing that is quite low, but it still results in data loss.

I'm not going to claim it's ideal, but performing an advanced operation incorrectly (and there is a very narrow path to doing it correctly) and getting behavior consistent with every other mistake (which also results in deletion) ought not to block a release.

tnozicka (Contributor, Author) commented Jun 16, 2017

@deads2k I am fine with letting it slip 1.6.0; I just wanted to point out that it's not only about our controllers.

openshift-bot (Contributor) commented:

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now, please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label on Feb 12, 2018
tnozicka (Contributor, Author) commented:

/lifecycle frozen

openshift-ci-robot added the lifecycle/frozen label on Feb 12, 2018