Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding labels to exported service leads to duplicate EndpointSlice objects #1646

Open
leasonliu opened this issue Sep 25, 2024 · 9 comments
Open
Assignees
Labels
bug Something isn't working

Comments

@leasonliu
Copy link

What happened:
When I added a new label to the exported service of the endpointslice that was managed by lighthouse agent, a new endpointslice with the new label was created, resulting in duplicate endpointslice objects. There was no new EndpointSlices managed by Kubernetes created.

Here are the other scenarios I tested, given there is an existing(old) endpointslice in the cluster
1st scenario,
add a label to service -> new endpointslice is generated
remove old endpointslice
delete the label from the service -> the label is removed from the new endpointslice

2nd scenario,
add a label to service -> new endpointslice is generated
add another label to service -> another new endpointslice is generated
delete the old endpointslice
delete the first label from the service -> the label is removed from the 3rd endpointslice
delete the second label from the service -> nothing happened to the 2 new endpointslices
At last, the 2nd endpointslice has the 1st label and the 3rd endpointslice has the 2nd label. When I restart one of the pods, the new IP does not get updated in both endpointslices, but it is updated in the kubernetes endpointslice.

3rd scenario,
add a label to service -> new endpointslice is generated
add another label to both new and old endpointslice (at this moment, old endpointslice has the 2nd label, new endpointslice has both labels)
delete the label from service -> nothing is updated in both endpointslices
restart a pod -> new IP is updated in kubernetes endpointslice, but not the 2 lighthouse endpointslices
remove the labels from endpointslice, delete a pod -> new IP is updated in the lighthouse endpointslice

What you expected to happen:
Adding labels to exported service should NOT lead to the creation of new EndpointSlice objects. It should just behave like how kubernetes endpointslice controller behaves.
How to reproduce it (as minimally and precisely as possible):
Add a dummy label to exported service, a new endpointslice will be created with the new label.
Anything else we need to know?:
It seems that a different labelset is considered different object by Submariner.
Environment:

  • Diagnose information (use subctl diagnose all):
  • Gather information (use subctl gather):
  • Cloud provider or hardware configuration: AKS, Azure VMs
  • Install tools:
  • Others: submariner version is 0.17.2. And 0.18.0 also has the same issue. kubernetes versions are 1.27.9 (mainly used), 1.30.3, and 1.30.4
@leasonliu leasonliu added the bug Something isn't working label Sep 25, 2024
@tpantelis
Copy link
Contributor

This is b/c we use GenerateName when creating EndpointSlices and use the labels to look up the existing one on update. We transfer all the Service labels to the EndpointSlices (as K8s does) but this causes a new one to be created when the Service labels were changed. We should instead lookup by Name prefix instead (tho probably less effiecient).

@tpantelis tpantelis self-assigned this Sep 25, 2024
@abguptms
Copy link

abguptms commented Oct 2, 2024

@tpantelis, the labels combination of name, namespace and source-cluster should return the existing object right? Is that not expected?

@tpantelis
Copy link
Contributor

@tpantelis, the labels combination of name, namespace and source-cluster should return the existing object right? Is that not expected?

There can be multiple EndpointSlices per cluster. Name prefix isn't correct either - I think we need to annotate the LH EndpointSlice with the UID of the service EndpointSlice and look up by the UID.

@abguptms
Copy link

abguptms commented Oct 2, 2024

I don't follow. The source-name in the lighthouse managed endpointslice is the endpointslice created by endpointslice controller. Since that name will be unique, we should be able to extract a unique lighthouse managed endpointslice and create/edit it. Using UID of the service can actually result in multiple endpointslices returned since default limit on endpointslice endpoints is 100 (unless overridden).

Maybe I am still missing something, but if we watch the endpointslices managed by endpointslice controller and create/edit linked endpointslice by lighthouse, we should always have a 1-1 mapping.

@tpantelis
Copy link
Contributor

The source-name in the lighthouse managed endpointslice is the endpointslice created by endpointslice controller.

Yes we do already have the "lighthouse.submariner.io/source-name" label - I missed that. That's only added for headless services tho it seems like it should also apply to non-headless.

Using UID of the service can actually result in multiple endpointslices ...

I meant the UID of the K8s Endpointslice. Using either name or UID would be fine.

@tpantelis
Copy link
Contributor

tpantelis commented Oct 3, 2024

In looking at this further, for a non-headless (ie ClusterIP) service, there is not a 1:1 mapping with K8s EndpointSlices, ie we only create one EndpointSlice with a single endpoint containing the service IP. We consult all the K8s EndpointSlices to set the LH endpoint's Ready condition. This is why we don't set the "lighthouse.submariner.io/source-name" label for non-headless EndpointSlices. So on non-headless K8s EndpointSlice update (or delete), we need to search for the LH EndpointSlice by only the identifying labels: lighthouse.submariner.io/sourceNamespace, multicluster.kubernetes.io/source-cluster and multicluster.kubernetes.io/service-name.

For a headless service EndpointSlice, we can search by the "lighthouse.submariner.io/source-name" label b/c those do map 1:1 to K8s EndpointSlices.

@abguptms
Copy link

abguptms commented Oct 3, 2024

Yes, that make sense. Thanks for the additional details. Is there a timeline for this fix? We have a few features which require the label propagation and now are blocked due to multiple endpointslices (stale and fresh) in the cluster.

@tpantelis
Copy link
Contributor

Yes, that make sense. Thanks for the additional details. Is there a timeline for this fix? We have a few features which require the label propagation and now are blocked due to multiple endpointslices (stale and fresh) in the cluster.

Unfortunately, I can't provide a specific timeline. It depends on priorities. Right now the devs are focused on completing the 0.19.0 release (which is in RC0). After that we can prioritize this issue. What version are you on? In order to expedite availability, we can backport it and include it in the next patch release for that version.

@abguptms
Copy link

abguptms commented Oct 3, 2024

We are at 0.17.2 right now. Yes, it would be great if the fixes can be patched for backward compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Backlog
Development

No branches or pull requests

3 participants