Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotate Service with LB ID and use for lookup #245

Merged
merged 7 commits into from
Jul 19, 2019

Conversation

timoreimann
Copy link
Contributor

@timoreimann timoreimann commented Jul 17, 2019

This change decouples load-balancer names from their identity, thereby permitting renames that do not impact associations to Services any longer.

Instead of the load-balancer name, we use the immutable load-balancer UUID. As a beneficial side-effect, we can eliminate the local load-balancer cache since the DigitalOcean API allows for direct lookups by UUID in O(1) (vs. O(n) that we previously experienced when iterating through all load-balancers existing in a given account).

Load-balancer IDs are added to their corresponding Service objects as annotations. This happens both on load-balancer creates as well as most retrieval operations (e.g., EnsureLoadBalancer) in order to annotate pre-existing Services as well. When no load-balancer ID can be found, we fall back to doing lookups by name.

Apologies for the somewhat bloated PR, which is mostly due to the test changes required by both leveraging the LB UUID as well as removing the LB cache. Unfortunately, these two should really go together. #243 would be helpful to have, but I did not want to add yet another refactoring in the same PR.

Additional changes this PR makes:

  • Add new getter method retrieveAndAnnotateLoadBalancer to retrieve a load-balancer by ID or name (based on the availability of the LB ID annotation), and patch the associated Service object with the found ID.
  • Use new getter method in {Get,Ensure,Update}LoadBalancer and EnsureLoadBalancerDeleted.
  • Exclude load-balancer activity status check from GetLoadBalancer as this is not the purpose of the method. (It is a pure getter without any logic.)
  • Save on API calls in EnsureLoadBalancer by leveraging the LoadBalancer value returned by the new getter method.
  • Improve clarity/readability of EnsureLoadBalancer by introducing a switch case statement.
  • Remove the load-balancer cache and reintroduce logic around name-based lookups that existed previously.
  • Remove the extra Kubernetes client maintained on the ResourcesController in favor of the client already managed by the resources struct.
  • Update the tagging service to use LB IDs for lookups.
  • Annotate errors.

Additional test-related changes:

  • Extend and refactor tests.
  • Use bare godo.Client for tests instead of godo.NewClient. The former comes with no initialization whatsoever, thus preventing us from running against the real DigitalOcean API during tests accidentally. Additionally, we also fail hard if a used API is not stubbed out.

Fixes #224

This change decouples load-balancer names from their identity, thereby
permitting renames that do not impact associations to Services any
longer.

Instead of the load-balancer name, we use the immutable load-balancer
UUID. As a beneficial side-effect, we can eliminate the local
load-balancer cache since the DigitalOcean API allows for direct lookups
by UUID in O(1) (vs. O(n) that we previously experienced when iterating
through all load-balancers existing in a given account).

Load-balancer IDs are added to their corresponding Service objects as
annotations. This happens both on load-balancer creates as well as most
retrieval operations (e.g., EnsureLoadBalancer) in order to annotate
pre-existing Services as well. When no load-balancer ID can be found, we
fall back to doing lookups by name.

Additional changes this PR makes:

- Add new getter method retrieveAndAnnotateLoadBalancer to retrieve a
  load-balancer by ID or name (based on the availability of the LB ID
  annotation), and patch the associated Service object with the found
  ID.
- Use new getter method in {Get,Ensure,Update}LoadBalancer and
  EnsureLoadBalancerDeleted.
- Exclude load-balancer activity status check from GetLoadBalancer as
  this is not the purpose of the method. (It is a pure getter without
  any logic.)
- Save on API calls in EnsureLoadBalancer by leveraging the LoadBalancer
  value returned by the new getter method.
- Improve clarity/readability of EnsureLoadBalancer by introducing a
  switch case statement.
- Remove the load-balancer cache and reintroduce logic around name-based
  lookups that existed previously.
- Remove the extra Kubernetes client maintained on the
  ResourcesController in favor of the client already managed by the
  resources struct.
- Update the tagging service to use LB IDs for lookups.
- Annotate errors.

Additional test-related changes:

- Extend and refactor tests.
- Use bare godo.Client for tests instead of godo.NewClient. The former
  comes with no initialization whatsoever, thus preventing us from
  running against the real DigitalOcean API during tests accidentally.
  Additionally, we also fail hard if a used API is not stubbed out.
@timoreimann
Copy link
Contributor Author

Another note: while the cloud provider interface mandates that Service objects should not be modified, this actually does not hold in practice anymore: it used to be true when the cloud provider framework still used UPDATE requests to add the external load-balancer IP address (ingress status) to the Service object without handling resource conflicts. These days, it employs PATCH semantics as well.

My tests have shown no problems whatsoever. kubernetes/cloud-provider#27 is touching on the matter as well. (See also my comment there and the following discussion.)

@@ -32,6 +32,10 @@ import (
)

const (
// annoDOLoadBalancerID is the annotation specifying the load-balancer ID
// used to enable fast retrievals of load-balancers from the API by UUID.
annoDOLoadBalancerID = "service.k8s.digitalocean.com/load-balancer-id"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use doks.digitalocean.com/load-balancer-id here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by kubernetes.digitalocean.com/load-balancer-id as discussed via DM to make this less specific about DOKS. (DIY clusters will use the annotation as well.)

var lb *godo.LoadBalancer
lb, err = l.retrieveAndAnnotateLoadBalancer(ctx, service)
switch {
case err == nil:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use switch err here.


`digitalocean-cloud-controller-manager` attaches the UUID of load-balancers to the corresponding Service objects (given they are of type `LoadBalancer`) using the `service.k8s.digitalocean.com/load-balancer-id` annotation. This serves two purposes:

1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove “ without impacting the correct working of digitalocean-cloud-controller-manager” because ‘support’ means the same thing. Perhaps just ‘To support renaming Load Balancers.’

@@ -75,6 +75,17 @@ The primary purpose of the variable is to allow DigitalOcean customers to easily

When a cluster is created in a non-default VPC for the region, the environment variable `DO_CLUSTER_VPC_ID` must be specified or Load Balancer creation for services will fail.

### Load-balancer ID annotations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We write it as Load Balancers in all our marketing materials, so that’s how we should write it here too.

Copy link
Contributor Author

@timoreimann timoreimann Jul 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much all references in CCM use the load-balancer style right now. If we want to adopt the marketing speech, let's do this in a separate PR.

@@ -75,6 +75,17 @@ The primary purpose of the variable is to allow DigitalOcean customers to easily

When a cluster is created in a non-default VPC for the region, the environment variable `DO_CLUSTER_VPC_ID` must be specified or Load Balancer creation for services will fail.

### Load-balancer ID annotations

`digitalocean-cloud-controller-manager` attaches the UUID of load-balancers to the corresponding Service objects (given they are of type `LoadBalancer`) using the `service.k8s.digitalocean.com/load-balancer-id` annotation. This serves two purposes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider adding a version here so people don’t rename on older versions of CCM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to address this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`.
2. To efficiently look up load-balancer resources in the DigitalOcean API.

The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace this paragraph with

digitalocean-cloud-controller-manager annotates new and existing Services. A Load Balancer that is renamed before the annotation is added gets lost, and a new one will be created.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/added gets lost/added will be lost/


The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling.

It is also possible to start owning an unmanaged load-balancer by adding the annotation manually prior to setting the Service type to `LoadBalancer`. Note however every load-balancer should be owned by exactly one `digitalocean-cloud-controller-manager` instance only to prevent concurrent and presumably conflicting modifications.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can import an existing Load Balancer by creating the Service annotated with the ID of the Load Balancer. Don’t do this while it’s being managed by another cluster, or the digitalocean-cloud-controller-managers will make conflicting modifications to the Load Balancer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with your suggestion with two exceptions:

  • I think "owning" is clearer than "importing"
  • You can also own/import a load-balancer after the Service has been created as long as the type isn't set to LoadBalancer yet. I think the new wording makes this easy enough to understand now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just call it ‘managing’ instead of owning, because that’s what you call it in the next sentence too.

You can also own/import a load-balancer after the Service has been created as long as the type isn't set to LoadBalancer yet

I think the way you’re wording this is confusing, because it makes it sounds like you need to:

  1. Create a Service
  2. Set the annotation
  3. Change the type to LoadBalancer

While this is possible, it’s a very roundabout way of doing it. You can just create a service of type LoadBalancer with the annotation in one go. Perhaps you could say ‘You [...] manage an existing load-balancer by creating a LoadBalancer Service annotated with the UUID of the load-balancer.’

Also or otherwise is a tautology.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopted your suggestion and rephrased the second sentence to be less tautological.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just dropped "otherwise" in the last sentence.

// lbByName gets a DigitalOcean Load Balancer by name. The returned error will
// be lbNotFound if the load balancer does not exist.
func (l *loadBalancers) lbByName(ctx context.Context, name string) (*godo.LoadBalancer, error) {
lbs, _, err := l.client.LoadBalancers.List(ctx, &godo.ListOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to paginate, like allLoadBalancerList

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Lucky for us, we already had a allLoadBalancerList function, so I just used that.

@jcodybaker
Copy link

jcodybaker commented Jul 17, 2019

I was a little concerned that this would cause problems with kubectl replace since those objects retain the same uuid but are otherwise set exactly to the state of the new object. However, it looks like that's likely to fail for other reasons and is there likely unused in common practice.

$ kubectl apply -f lb.yaml 
service/server created
$ 
$ kubectl annotate service/server 'doks.digitalocean.com/load-balancer-id=1234'
service/server annotated
$ kubectl get service/server -o jsonpath='{.metadata.uid} {"\n"}{.metadata.annotations.doks\.digitalocean\.com/load-balancer-id} {"\n"}'
dbde9d51-a8a4-11e9-bb23-c2cebb59613b 
1234 
$ kubectl replace -f lb.yaml 
The Service "server" is invalid: spec.clusterIP: Invalid value: "": field is immutable
$ kubectl apply -f lb.yaml 
service/server configured
$ kubectl get service/server -o jsonpath='{.metadata.uid} {"\n"}{.metadata.annotations.doks\.digitalocean\.com/load-balancer-id} {"\n"}'
dbde9d51-a8a4-11e9-bb23-c2cebb59613b 
1234

Copy link

@jcodybaker jcodybaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks really good. Nice work. Will 👍 once @bouk 's concerns are addressed.

id = loadBalancerIDsByName[name]
}

if id != "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could either do an else here or just put a continue at the end of the if id == "" {} block.

Copy link
Contributor Author

@timoreimann timoreimann Jul 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot assume that we have an ID for sure at this point: in the case where the user renamed the LB, loadBalancerIDsByName[name] wouldn't return anything.

I added a comment to make that clearer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Yeah, now I see this. Sorry for the unnecessary comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all -- it pointed out that the reasoning behind this wasn't clear.

1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`.
2. To efficiently look up load-balancer resources in the DigitalOcean API.

The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/added gets lost/added will be lost/

@@ -267,6 +267,7 @@ k8s.io/apiextensions-apiserver/pkg/features
# k8s.io/apimachinery v0.0.0 => k8s.io/apimachinery v0.0.0-20190612205821-1799e75a0719
k8s.io/apimachinery/pkg/labels
k8s.io/apimachinery/pkg/types
k8s.io/apimachinery/pkg/util/strategicpatch

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not actually adding/removing anything from the vendor here. Why did this file change? Do we want to include these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it changed because we started to use the referenced package as a direct dependency now, whereas it was consumed transitively only before.

@bouk bouk self-requested a review July 18, 2019 08:44
@timoreimann
Copy link
Contributor Author

@bouk @jcodybaker addressed all comments. PTAL again.

We overwrite lb in the return call to Update() which means we
cannot reference it in case of an error.
Copy link
Contributor

@bouk bouk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment about the existing load balancer docs, but then LGTM

id = loadBalancerIDsByName[name]
}

if id != "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Yeah, now I see this. Sorry for the unnecessary comment?

@timoreimann timoreimann merged commit 5befe7d into digitalocean:master Jul 19, 2019
@timoreimann timoreimann deleted the annotate-lb-id branch July 19, 2019 14:13
@@ -8,6 +8,7 @@

### Changed

* Annotate Service objects by load-balancer UUIDs to enable free LB renames and improve the DO API consumption performance (@timoreimann)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed this got merged into the v0.1.16 section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch -- submitted #249 to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Renaming load balancers causes a new instance to be created leaving the previous one dangling
4 participants