Flannel-external-ip is ignored in cloud environments? #10295
As the docs say:
If you disable the built-in cloud-controller, K3s no longer has a native integration point to set the external IPs. This is instead handled by whatever infrastructure provider specific cloud controller you deploy. Since those are not integrated into K3s's embedded flannel, you'll need to manually set any additional annotations necessary to inform Flannel about those IPs.
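For reference, flannel reads its advertised endpoint from node annotations, so with an external CCM the override can be applied by hand. A minimal sketch, assuming flannel's documented `flannel.alpha.coreos.com/public-ip-overwrite` annotation (verify the exact annotation name against the flannel version in use; node name and address are placeholders):

```yaml
# Hypothetical node patch: ask flannel to advertise this address
# instead of the auto-detected private one.
apiVersion: v1
kind: Node
metadata:
  name: my-agent-node
  annotations:
    flannel.alpha.coreos.com/public-ip-overwrite: "3.72.94.253"
```

Applied via `kubectl annotate` or `kubectl patch`, this should take effect when flanneld next initializes the node.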
To be clear, you're trying to use this to manage a hybrid deployment? You have some nodes that you want to be managed by the AWS CCM, with the node and flannel using the external IPs set by that CCM, while other nodes are managed by the K3s CCM, with the node and flannel using the external IPs set by the --node-external-ip flag? This sort of thing isn't really allowed for by the cloud provider model; it is generally expected that all nodes in the cluster will be managed by the same CCM. It would take some additional work to make K3s set the flannel external IP annotations based on the node external IP provided by another CCM, and to ensure that flannel starts up AFTER that CCM has already had a chance to initialize the node. I don't even know how the AWS CCM would handle the presence of non-AWS nodes in the cluster.
cc @manuelbuil I think this would require
Thanks for looking at this seriously. It would help us significantly, as described below:
Actually our setup is somewhat simpler: we have a cluster on AWS, to which we want to add remote nodes (k3s agents) that run on local workstations. Basically, we'd like to "federate" the cluster to local workstations. This works well enough, using bootstrap tokens, a tightly controlled Flannel configuration (which is where the current issue pops up), etc. We're even quite a few steps towards running the local k3s agent in a rootless setup. For completeness: our aws-cloud-controller-manager is configured to not do any network configuration inside the cluster:
I'm not familiar enough with the peculiarities, but setting the IP annotations based on the CLI arguments given during startup seems independent of whether there is an external CCM or not? Or am I fully missing the point here?
This seems like a good change; these annotations are very Flannel-specific by nature. For the short term we've solved this issue by manually deploying Flannel as a CNI-plugin daemonset, as described at: https://github.com/flannel-io/flannel This ensures the node is started before Flannel initializes. However, passing the correct external IP address to that setup is also non-trivial, especially in a rootless configuration. Just having these CLI arguments working would simplify our setup significantly.
How exactly are you accomplishing that? All the CNI-related stuff seemed pretty broken last time I tried to get it working rootless.
Even if K3s did set the annotations for you when you're not using our cloud-provider, you'll still need to properly set the external IPs for each node, right?
Yes, it's a bit of a mess. Basically that's the next issue to tackle: in the current rootless client options, k3s hardcodes the list of copy-up dirs, which is missing /opt/cni as an entry. This means the kube-flannel based CNI plugin can't create that folder, so we moved the cni-bin-dir to /run/opt/cni/bin. But as you know, that's another set of annoying configuration changes that need to be made, on both the containerd and kubelet side.
In effect, the external IPs are a given of the host you're running K3s on. In the case of the AWS servers, they are part of the EC2 setup and can be obtained from inside the host. Similarly, on the remote k3s agents' hosts, we can just determine the public IP addresses up front. Effectively we tell K3s what the external IP is, expecting K3s to just pass it along to Flannel (and Flannel to pass it along to WireGuard). An alternative setup would be to manually (outside K3s) set up a WireGuard network and tell Flannel to use that pre-configured network directly. But I really like the dynamic WireGuard setup that Flannel provides out of the box.
I don't think so either
Yes, very likely.
By reading my issue #6177, I can confirm that we decided to set them as part of the cloud provider so that they are ready before flanneld is started |
Although I still believe that moving these annotations is a good enhancement, our use-case has weakened a bit: we've decided not to use the AWS CCM, but to revert to the K3s built-in CCM. The AWS version wasn't actually doing anything for us anymore, given that we pre-set the IP addresses already and were not using "cloud-routes", etc. The AWS CCM was actually working against our use-case, as it introduced a race condition where the CCM would remove our remote nodes before they could become Ready. As a consequence, the flannel-external-ip flag is currently not a problem for us anymore. However, if this issue is worked on and fixed, I will still be able to provide test results and feedback. Just as a side-note (and if you like, I can try a full write-up of how we achieved this):
Yeah, that sort of thing is what I've seen in the past, and was what I was alluding to with
Environmental Info:
K3s Version:
Currently running v1.28.5+k3s1; however, the relevant code sample below is from the main branch, so this affects all versions over the last few years.
Node(s) CPU architecture, OS, and Version:
On AWS EC2 instances:
Linux host-6f82bbb7-64bd-495a-87f6-d6256171dac6 5.15.117-flatcar #1 SMP Tue Jul 4 14:43:38 -00 2023 x86_64 Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz GenuineIntel GNU/Linux
Cluster Configuration:
Describe the bug:
As documented, there is a flannel-external-ip flag available in the k3s configuration that tells the flannel backend to use the IP address provided by the node-external-ip config option. However, this flag is ignored if k3s is configured to use an external cloud provider, as shown in:
k3s/pkg/agent/run.go
Line 382 in df5db28
This disallows several use-cases in cloud-hosted deployments (e.g. on AWS EC2 hosts, as in our case) where some agent nodes are located on user premises and/or in cross-cloud setups.
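The gating described above can be paraphrased as a tiny Go sketch. Note this is a reading of the behavior this issue reports, not the literal k3s source; the function and parameter names are hypothetical:

```go
package main

import "fmt"

// flannelUsesExternalIP paraphrases the condition this issue describes:
// the flannel external-IP wiring only happens when K3s's embedded cloud
// controller is in charge, so with an external cloud provider the
// --flannel-external-ip flag is silently ignored.
func flannelUsesExternalIP(flannelExternalIP, externalCloudProvider bool) bool {
	return flannelExternalIP && !externalCloudProvider
}

func main() {
	fmt.Println(flannelUsesExternalIP(true, true))  // external CCM: flag ignored
	fmt.Println(flannelUsesExternalIP(true, false)) // embedded CCM: flag honored
}
```

The ask in this issue is effectively to decouple the two inputs, so the flag is honored regardless of which CCM manages the node.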
Steps To Reproduce:
```yaml
flannel-backend: "wireguard-native"
egress-selector-mode: cluster
disable-cloud-controller: true
disable:
kube-controller-manager-arg:
kubelet-arg:
kube-apiserver-arg:
node-external-ip: 3.72.94.253
flannel-external-ip: true
```
Expected behavior:
I expected the wireguard configuration of flannel to use the public address ("3.72.94.253" in this case), allowing flannel traffic over the Internet with wire-level security.
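For illustration, with the flag honored, each peer's endpoint in the WireGuard configuration would carry that node's public node-external-ip rather than a VPC-private address, along these lines (placeholder values):

```
peer: (public key)
  endpoint: <peer-node-external-ip>:51820
  allowed ips: 10.42.2.0/24
```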
Actual behavior:
The actual wireguard configuration still uses the local, private IP address provided by the AWS EC2 server's network interface. This makes routing the flannel traffic over the Internet impossible (and it would be insecure in any case).
Additional context / logs:
WireGuard configuration using the incorrect (private) addresses:
```
interface: flannel-wg
  public key: (hidden)
  private key: (hidden)
  listening port: 51820

peer: PPC1sy2btO8Ihs673FaxPQaFxYbEsMKIM0Oa6gC0TkA=
  endpoint: 172.16.122.59:51820
  allowed ips: 10.42.2.0/24
  latest handshake: 46 seconds ago
  transfer: 2.48 MiB received, 10.37 MiB sent
  persistent keepalive: every 25 seconds

peer: qWFvdE2fovfNlCEceP5jASeBTBJSBBSr3DuBMCbPj2o=
  endpoint: 172.16.121.41:51820
  allowed ips: 10.42.1.0/24
  latest handshake: 1 minute, 42 seconds ago
  transfer: 210.50 MiB received, 1.41 GiB sent
  persistent keepalive: every 25 seconds
```