Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track underlying cloud resources for Services #27

Open
andrewsykim opened this issue Apr 4, 2019 · 17 comments
Open

Track underlying cloud resources for Services #27

andrewsykim opened this issue Apr 4, 2019 · 17 comments
Assignees
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. P2 Priority 2
Milestone

Comments

@andrewsykim
Copy link
Member

andrewsykim commented Apr 4, 2019

SIG CP backlog/tracking issue for kubernetes/kubernetes#70159

Following Tim's comment we also need to have a discussion on whether we want to document this as a best practice or enforce this across existing providers.

@andrewsykim
Copy link
Member Author

cc @aoxn

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 3, 2019
@timoreimann
Copy link
Contributor

@andrewsykim this issue and the originating one in k/k seem to concern the very point I raised the other day on Slack about attaching DigitalOcean load-balancer UUIDs as annotations. I'm currently PoC'ing if this is possible to do with the latest cloud provider framework (i.e., the Kubernetes 1.15 dependencies), especially with regards to Service objects being modified not only by the service controller. (kubernetes/kubernetes#70159 (comment) touches on the same topic, albeit at a time when PATCH semantics were not yet in use.)

@MrHohn
Copy link
Member

MrHohn commented Jul 3, 2019

especially with regards to Service objects being modified not only by the service controller.

(edit: Sorry for hitting enter too soon :/)

I was thinking about this as well. With PATCH now being used for updating Service status / metadata, we shouldn't be hitting conflict issue if underlying cloud provider implementation updates the annotations in between. Though it seems to oppose what the interface originally defines:

cloud-provider/cloud.go

Lines 117 to 121 in 4025669

// EnsureLoadBalancer creates a new load balancer 'name', or updates the existing one. Returns the status of the balancer
// Implementations must treat the *v1.Service and *v1.Node
// parameters as read-only and not modify them.
// Parameter 'clusterName' is the name of the cluster as presented to kube-controller-manager
EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error)

Maybe relaxing what the interface defines is the easiest way, however it feels somehow fragile without a generic mechanism to enforce the behavior.

@andrewsykim
Copy link
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2019
@timoreimann
Copy link
Contributor

timoreimann commented Jul 10, 2019

My PoC to validate that patching the Service object does not interfere with the service controller's own operations and patching was successful -- I couldn't spot any error logs or changing behaviors as the LBs came up. I think the patching of mine triggers another service update, but that's not too much of a concern.

One issue I'm still somewhat struggling with is that I'd like the patching to occur as my cloud provider methods are executed (i.e., during EnsureLoadBalancer) which means I need to inject a Kubernetes clientset. The Initialize interface method already passes on a cloudprovider.ControllerClientBuilder to derive the client from. However, the client should ideally (at least that's what I think) go into the cloud implementation that's being returned as a callback to cloudprovider.RegisterCloudProvider. I currently manage to do this by setting a field on my cloud struct during Initialize, but that feels kinda dirty and relies on the current cloud provider initialization order (that is, Initialize is called prior to the service controllers starting).

I could also set up a controller of my own, but that seems like a lot of extra work I don't really need. Maybe we can extend the Register callback signature to pass along the necessary configuration settings?

Curious if anyone has any ideas and what folks think. (@MrHohn, did you possibly explore a similar direction?)

@andrewsykim
Copy link
Member Author

@aoxn do we want to work on this for v1.16?

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 31, 2019
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 30, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andrewsykim
Copy link
Member Author

I actually think it's okay to close this now. Service w/ Type=LB now have finalizers so cloud providers should be able to find and delete any associated resource without worrying about the service resource no longer existing.

@timoreimann
Copy link
Contributor

@andrewsykim my understanding was that the tracking issue kubernetes/kubernetes#70159 had a different focus, which is to track a non-ambiguous ID for the cloud LB. Right now the only identifier that the Cloud Provider interface provides is the name; however, the name can be subject to change (even on purpose to support LB renaming), so a different means seems necessary.

The finalizer seems to ensure that resources get deleted eventually, but does not guarantee that I am able to reference the cloud LB easily / efficiently / non-ambiguously at any point during its lifetime. That's why we started to track the LB ID as an annotation on the Service at DigitalOcean but that comes with a few restrictions. (See my comment #27 (comment) above.)

@andrewsykim
Copy link
Member Author

that makes sense

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Mar 23, 2020
@k8s-ci-robot
Copy link
Contributor

@andrewsykim: Reopened this issue.

In response to this:

that makes sense

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andrewsykim
Copy link
Member Author

/remove-lifecycle rotten
/lifecycle frozen

@timoreimann let me know if you can work on this

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Apr 15, 2020
@timoreimann
Copy link
Contributor

timoreimann commented Apr 28, 2020

@andrewsykim I can put together a PR as a basis for further discussion. We've been "violating" the do-not-modify-services-directly advice mentioned in the GoDoc for quite some time in DigitalOcean's CCM in order to associate the LB ID with the Service. Having a "blessed" approach for doing so would be beneficial from my point of view.

I'm not sure if I can make it for 1.19 though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. P2 Priority 2
Projects
None yet
Development

No branches or pull requests

6 participants