
fix: bind HostDNS to 169.254.x link-local address #9200

Merged (1 commit into siderolabs:main) on Aug 19, 2024

Conversation

@smira (Member) commented Aug 19, 2024

This is an attempt to fix many issues related to trying to use a Service IP for host DNS.

Fixes #9196
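
For context, the machine config knob involved looks roughly like this; a sketch only, with the machine.features.hostDNS key path assumed from the Talos documentation rather than quoted in this PR:

machine:
  features:
    hostDNS:
      enabled: true                # run the host DNS caching resolver
      forwardKubeDNSToHost: true   # forward kube-dns (CoreDNS) upstream queries to it;
                                   # before this PR that went via a Service IP, after it
                                   # via the link-local address 169.254.116.108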

@smira smira added this to the v1.8 milestone Aug 19, 2024
@smira (Member, Author) commented Aug 19, 2024

Idea comes from @utkuozdemir



// HostDNSAddress is the address of the host DNS server.
//
// Note: 116 = 't' and 108 = 'l' in ASCII.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

Review thread on hack/release.toml (outdated, resolved).
Commit message:

This is an attempt to fix many issues related with trying to use Service
IP for host DNS.

Fixes siderolabs#9196

Signed-off-by: Andrey Smirnov <[email protected]>
@smira (Member, Author) commented Aug 19, 2024

/m (bot merge command)

@talos-bot merged commit ee4290f into siderolabs:main on Aug 19, 2024
50 checks passed
@maxpain (Contributor) commented Sep 10, 2024

Does this work with Cilium?

@smira (Member, Author) commented Sep 10, 2024

> Does this work with Cilium?

You can tell us! The previous approach worked with Cilium in its default config but failed with non-default settings on the Cilium side.

@maxpain (Contributor) commented Oct 1, 2024

@smira 169.254.116.108 isn't pingable from the pod network when eBPF masquerading is enabled, and the CoreDNS pods don't work because of this:

bpf:
  masquerade: true

@maxpain (Contributor) commented Oct 1, 2024

I even tried to disable masquerading for link-local addresses using the following configuration:

bpf:
  masquerade: true

ipMasqAgent:
  enabled: true
  config:
    masqLinkLocal: false

The range does show up in the BPF non-masquerade list:

root@w1:/home/cilium# cilium-dbg bpf ipmasq list
IP PREFIX/ADDRESS
169.254.0.0/16

But it didn't help.

Any ideas?

@smira (Member, Author) commented Oct 1, 2024

You should ask the Cilium folks; it works with Cilium both with and without kube-proxy, but there are too many Cilium configuration options to cover them all.

@dhess commented Oct 15, 2024

After upgrading my Cilium cluster from Talos v1.7.x to v1.8.1, the default DNS configuration is completely broken, presumably due to this issue.

It would be helpful to add a warning to the Talos docs for users running Cilium.

@smira (Member, Author) commented Oct 15, 2024

This configuration works with Cilium using the defaults (we have it tested in CI), but it might not work with some non-default Cilium configuration, so it's better to find the actual issue.

@Nomsplease commented Oct 29, 2024

@smira can you share the "defaults" you are referring to when testing this deployment of Cilium? Per your docs, I have replicated the documented defaults on my Talos 1.8.2 cluster with Cilium 1.16.3, and I get i/o timeouts from the CoreDNS pods. So something was clearly broken by this change.

Specifically:

[coredns-68d75fd545-vx4qp] [ERROR] plugin/errors: 2 s3.TLD. AAAA: read udp 10.244.2.8:53228->169.254.116.108:53: i/o timeout

These are the Helm values the cluster was deployed with; they do not work:

cgroup:
  autoMount:
    enabled: false
  hostRoot: /sys/fs/cgroup
cluster:
  id: 1
  name: main
cni:
  exclusive: false
devices: br+
ipam:
  mode: "kubernetes"
ipv4NativeRoutingCIDR: 10.244.0.0/16
k8sServiceHost: 127.0.0.1
k8sServicePort: 7445
kubeProxyReplacement: true
securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE

@maxpain (Contributor) commented Oct 29, 2024

@Nomsplease have you restarted the coredns pods after deploying Cilium?

@Nomsplease replied:

> @Nomsplease have you restarted the coredns pods after deploying Cilium?

Multiple times. I even uninstalled the separate CoreDNS deployment I had, moved back to the Talos-managed resources, and reapplied everything with the k8s upgrade command. This clearly broke with 1.8.x, and I have been pulling my hair out trying to fix it.

@maxpain (Contributor) commented Oct 29, 2024

@Nomsplease Are you sure you are not using bpf.masquerade=true in the Cilium Helm values?
I faced the same issue and decided to disable forwardKubeDNSToHost; reusing the DNS cache between the CoreDNS pods and the host does not offer much benefit.
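
For reference, turning that forwarding off is a single machine config setting; a minimal sketch of such a patch, assuming the machine.features.hostDNS key path from the Talos documentation:

machine:
  features:
    hostDNS:
      enabled: true                 # keep the host DNS caching resolver itself
      forwardKubeDNSToHost: false   # stop forwarding kube-dns (CoreDNS) upstream
                                    # queries to the host's 169.254.116.108 listener

Such a patch could be applied with talosctl patch machineconfig; a restart of the CoreDNS pods may also be needed, as suggested earlier in the thread.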

@Nomsplease replied:
> @Nomsplease Are you sure you are not using bpf.masquerade=true in the Cilium Helm values? I faced the same issue and decided to disable forwardKubeDNSToHost; reusing the DNS cache between the CoreDNS pods and the host does not offer much benefit.

I saw your previous issue, which is what led me here. My Helm values are above. My previous values included bpf.masquerade, but I was trying to rule out variables, hence stripping everything down to the "defaults". Maybe I'll just turn off the DNS-to-host forwarding, since clearly something wasn't tested or was missed. The Cilium folks don't seem to want to look into it.

@smira (Member, Author) commented Oct 29, 2024

> @smira can you share the "defaults" you are referring to when testing this deployment of Cilium? Per your docs, I have replicated the documented defaults on my Talos 1.8.2 cluster with Cilium 1.16.3, and I get i/o timeouts from the CoreDNS pods. So something was clearly broken by this change.

You can see for yourself in the CI: https://github.com/siderolabs/talos/actions/runs/11566835784/job/32196143544

@rkerno commented Nov 7, 2024

I've just stumbled across this issue. The configuration below works until I switch to bpf.hostLegacyRouting=false. I'm restoring Talos from backups taken before the CNI is installed to test each Cilium configuration option; it's the only way I can be sure the configuration is applied cleanly, without side effects from a previous configuration. I'm going to test the Cilium ipMasqAgent settings that @maxpain mentioned above now and will report back with my findings.

helm template cilium cilium/cilium \
	--version 1.16.3 \
	--namespace kube-system \
	--set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
	--set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
	--set cgroup.autoMount.enabled=false \
	--set cgroup.hostRoot=/sys/fs/cgroup \
	--set kubeProxyReplacement=true \
	--set k8sServiceHost=127.0.0.1 \
	--set k8sServicePort=7445 \
	--set ipv4.enabled=true \
	--set ipv4NativeRoutingCIDR="10.244.0.0/16" \
	--set ipam.operator.clusterPoolIPv4PodCIDRList="10.244.0.0/16" \
	--set devices="eth0 eth1" \
	--set routingMode=native \
	--set autoDirectNodeRoutes=true \
	--set bpf.masquerade=true \
	--set bpf.hostLegacyRouting=true \
	--set bpf.datapathMode=netkit \
	--set enableIPv4Masquerade=true \
	--set hubble.enabled=true \
	--set hubble.relay.enabled=true \
	--set hubble.ui.enabled=true \
	> cilium.yaml

@rkerno commented Nov 7, 2024

I think this is related to this Cilium issue: cilium/cilium#29413

The recommendation there is to move the host's DNS from the loopback device to the dummy device when using BPF host routing. I've had a quick look and didn't see any Talos configuration option to achieve this, so I guess if I want BPF host routing I need to disable Talos' forwardKubeDNSToHost.

Successfully merging this pull request may close these issues.

DNS resolution seems broken with nftables and forwardKubeDNSToHost in development version
8 participants