Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add dynamic rest mapper to the admission plugin initializer #16215

Merged
merged 2 commits into from
Sep 12, 2017

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Sep 7, 2017

fixes #16141

picks the dynamic memcache discovery for the RESTMapper and wires it for admission. There's a post-start hook that starts it refreshing.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 7, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 7, 2017
aggregatedAPIServer.GenericAPIServer.AddPostStartHookOrDie("admission.openshift.io-RefreshRESTMapper", func(context apiserver.PostStartHookContext) error {
c.RESTMapper.Reset()
go func() {
wait.PollUntil(10*time.Second, func() (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the upstream issues handy where @caesarxuchao noted the performance characteristics of this refresh, but I seem to recall it not being great. Would it be worth at least making the period configurable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the upstream issues handy where @caesarxuchao noted the performance characteristics of this refresh, but I seem to recall it not being great. Would it be worth at least making the period configurable?

We'll get to see how bad it is. Without rate limiting, I would expect the calls to be quick for discovery.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will performance issues here surface? 10 seconds seems aggressive to me. also, does use of this this tolerate partial discovery failures?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to catch panics since this is in a different go routine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 seconds seems aggressive to me

And arbitrary.

does use of this this tolerate partial discovery failures?

Since we blindly reset the rest mapper, it would be possible to get into a state where GetAPIGroupResources fails and we have no longer have the old rest mapper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will performance issues here surface? 10 seconds seems aggressive to me. also, does use of this this tolerate partial discovery failures?

Better hope so, since its upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 seconds seems aggressive to me. also, does use of this this tolerate partial discovery failures?

Got a preferred time? I don't like waiting, so I tend to prefer tighter times.

And arbitrary.

Any choice would be arbitrary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we blindly reset the rest mapper, it would be possible to get into a state where GetAPIGroupResources fails and we have no longer have the old rest mapper.

Doesn't look like it to me. @ironcladlou's implementation does a refresh on invalidate. If you can't make contact, you keep using the old results. Individual servers which fail are skipped and old results are used if they are present.

how will performance issues here surface?

On the invalidating gofunc (not the admission gofunc), you get lag as it waits. Old results are used for admission. It all seems pretty reasonable to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to catch panics since this is in a different go routine?

I was reasonably sure that the wait.* functions protected the callers, but I'm having trouble finding it.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 7, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 8, 2017
@deads2k
Copy link
Contributor Author

deads2k commented Sep 8, 2017

/assign liggitt

I think this blocking @spadgett

@pmorie
Copy link
Contributor

pmorie commented Sep 11, 2017

Yeah, this is blocking users of catalog at the moment, so if we could get this in, that would be great

go func() {
wait.Until(func() {
c.RESTMapper.Reset()
}, 10*time.Second, context.StopCh)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it fails well, doesn't crash the process, doesn't affect apiserver response time, and handles partials as well as can be expected. Anyone want to come out with an "I prefer to wait longer, so please change it to value X"?

@liggitt
Copy link
Contributor

liggitt commented Sep 11, 2017

If you can't make contact, you keep using the old results.

works for me

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2017
@enj
Copy link
Contributor

enj commented Sep 11, 2017

The part that is confusing is the unconditional niling of delegate in https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/staging/src/k8s.io/client-go/discovery/restmapper.go#L186

It makes it appear that getDelegate could fail since it has to rebuild the delegate, but GetAPIGroupResources will never fail since Invalidate will re-use old data if it fails to fetch. So the continuous rebuilding of delegate is not an issue, though possibly wasteful. It could be guarded with Fresh, but that interface does not really work since its not atomic.

/lgtm

@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, enj, liggitt

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue

@openshift-merge-robot openshift-merge-robot merged commit 15f7720 into openshift:master Sep 12, 2017
@deads2k deads2k deleted the gc-01-restmapper branch January 24, 2018 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to bind service to create a secret
8 participants