Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sync-catalog] Fix NodePort register service with wrong internal IP when multiple internal IPs are reported on the node #619

Merged
merged 1 commit into from
Aug 19, 2021

Conversation

kolorful
Copy link
Contributor

@kolorful kolorful commented Aug 17, 2021

This PR fixed an issue where consul-sync will register NodePort services with wrong internal IPs when multiple internal IPs are reported on the node, and this is a common pattern when using the AWS CNI plugin.

When using the AWS CNI plugin, multiple ENIs could be used and the secondary ENI with multiple IPs are not used by the node itself, but assigned to individual pods. In this case kubelet will end up reporting all used IPs on all ENIs as internal IPs on the node. (Detailed explanation)

Kubernetes has since made sure the first IP would be equal to the IP on eth0 (reference 2) and it would be the actual IP of the Node.

Prior to my fix the consul-sync program always sync the last matching IP to consul and it ends up being a Pod IP on that node. This PR fixed it by always picking the first matching IP as how Kubernetes is implemented.

Changes proposed in this PR:

  • Use the first found Node IP address when syncing NodePort service to consul

How I've tested this PR:

  • make test
  • Built a docker image with my change then tested in our AWS cluster that has multiple ENIs and verify it now syncs the correct Node IP address for NodePort service.

How I expect reviewers to test this PR:

  • Run test suite should be good enough since I've already tested in a real cluster.

Checklist:

  • Bats tests added
  • CHANGELOG entry added (HashiCorp engineers only, community PRs should not add a changelog entry)

@hashicorp-cla
Copy link

hashicorp-cla commented Aug 17, 2021

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good, just one suggestion for comments.

control-plane/catalog/to-consul/resource.go Show resolved Hide resolved
control-plane/catalog/to-consul/resource.go Show resolved Hide resolved
@kolorful
Copy link
Contributor Author

kolorful commented Aug 17, 2021

@lkysow thanks for reviewing the PR! I've merged in your suggestion, squashed the commit and rebased master.

@lkysow lkysow merged commit 3787573 into hashicorp:master Aug 19, 2021
lkysow added a commit that referenced this pull request Aug 19, 2021
lkysow added a commit that referenced this pull request Aug 19, 2021
lawliet89 pushed a commit to lawliet89/consul-k8s that referenced this pull request Sep 13, 2021
@kolorful
Copy link
Contributor Author

kolorful commented Oct 1, 2021

Hi @lkysow, is it possible to backport this change to 0.25, 0.26, 0.33 because we are using consul:1.9.5 + consul-k8s 0.25 and consul 1.10 introduced a backward incompatible change which caused catalog-sync to fail on errors like:

2021-10-01T18:33:37.897Z [WARN]  to-consul/sink: error registering service: node-name=k8s-sync service-name=tstsvc service="&{ tstsvc-b48e46ea737f tstsvc [awsnrt2] map[external-k8s-ns:tstsvc external-source:kubernetes port-http:80 port-https:443] 30080 10.135.200.155  map[] {0 0} false 0 0  <nil> <nil>   }" err="Unexpected response code: 400 (Request decode failed: json: unknown field "SocketPath")"

Looks like consul-k8s upgraded to consul 1.10 in 0.34

@lkysow
Copy link
Member

lkysow commented Oct 5, 2021

Hi @kolorful are you not able to upgrade to consul 1.10 and consul-k8s latest? Or is there an issue with those versions together?

@kolorful
Copy link
Contributor Author

kolorful commented Oct 5, 2021

Hi @lkysow we plan to, but it would take some time before we can get there. If back-port is as easy as running a script for you, it would be really appreciated so we don't need to maintain a fork for the time being. Thank you :)

@lkysow
Copy link
Member

lkysow commented Oct 5, 2021

Ahh I see. We actually don't do any backport releases of consul-k8s right now so it's unfortunately not as easy as running a script.

@kolorful
Copy link
Contributor Author

kolorful commented Oct 5, 2021

I see, that's fine, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants