flannel-v6.1 MAC address changes every boot #9957
Comments
cc @manuelbuil. This may belong in https://github.com/flannel-io/flannel |
Thanks for reporting this! I don't think this is a regression. Flannel was picking a new MAC address for the vxlan interface on each reboot, and this was fixed by this PR: flannel-io/flannel#1829. But it seems the author of that PR did not add the same logic for the v6 interface |
Thank you for making that PR, @manuelbuil. Just for my understanding, can you expand on how this isn't a regression? i.e. how have I never experienced this problem? By "problem" I mean the broken flannel network, not the changing MAC. I just assumed the broken network was due to the changing MAC, but it looks like prior to flannel-io/flannel#1829 both interfaces' MAC should have been changing, which means that either I (and everyone else using flannel) should have experienced this issue on every reboot I've done, or I'm missing something. Do you expect flannel to actually be able to handle changing MAC addresses? If so, that functionality appears to have broken somehow. Did k3s change the config to make the interfaces non-learning, perhaps? That might be worth looking into, although once your PR lands it looks like neither interface should be changing anymore. |
My understanding is that before the user's PR, MAC addresses were changing on each reboot. I don't think K3s is changing any default kernel behaviour, so yes, the bug should have been present in K3s. Maybe Linux networking components were able to re-learn the new MAC address quickly except in certain environments? It could be a nice investigation to do, I agree |
Huh, how odd. Doesn't really matter I guess, your flannel PR will fix the issue. Any idea when that will be contained in a k3s release? |
It should be included in the May release. We are currently under code freeze for the April release |
Hi @manuelbuil, is there a potential workaround for Canal on RHEL 8.8 until the release? A previous issue recommended updating the flannel config to set MACAddressPolicy to none, but I'm not sure this would work, as /etc/systemd/network doesn't exist on my RHEL nodes and this is an RKE2 system running Canal. We are using IPv6 for our primary pod-to-pod traffic. cat<<'EOF'>/etc/systemd/network/10-flannel.link [Link] |
Yes, you could use the new flannel image once it is ready. We are waiting on one extra PR to be merged in Flannel and then we will release v0.25.2 with the fix |
This sounds almost certainly related to #9807. |
Nope, I'm wrong. I just updated to v1.29.4+k3s1 and this issue persists. Huh, very confusing. Oh well, I'll just hold out for @manuelbuil's fix to be contained in a release. |
Right, I meant whether or not #9807 was related, the fix for which was contained in v1.29.4+k3s1. |
No, that is an issue with kube-router's netpol controller. This is a flannel issue. Different components. |
Good point. I'm grasping at straws, obviously :) . I just don't like not understanding what broke here, heh. |
@brandond or @manuelbuil |
The idea is to include it in 1.29, 1.28 and 1.27. Same for RKE2 |
Validated on Version:
$ k3s version v1.30.1+k3s-f2e7c01a (f2e7c01a)
Environment Details
Infrastructure
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Steps to validate the fix
Reproduction Issue:
Validation Results:
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
3 server cluster, dual stack (ipv4 and ipv6). Each node has two NICs, one public, one private. Using flannel with the vxlan backend.
Describe the bug:
Whenever one of my nodes reboots (node A), the flannel.1 interface's MAC address stays the same, but the flannel-v6.1 interface's MAC address changes. This leads to the flannel network being broken: both other nodes (B and C) believe A to be reachable via its old MAC, but it's not. As a result, node A cannot ping the flannel-v6.1 interfaces on either node B or C, or vice versa (while flannel.1 pings work just fine).
The problem is two-fold:
- ip -6 neighbor show | grep <node A's flannel-v6.1 IP> has the old MAC address
- bridge fdb show dev flannel-v6.1 | grep <node A's node-ip> has the old MAC address
I'm able to work around this by running the following on the other nodes whenever a node reboots:
ip -6 neighbor change <flannel-v6.1 IP> dev flannel-v6.1 lladdr <new mac>
bridge fdb add to <new mac> dst <rebooted node's node-ip> dev flannel-v6.1
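For anyone who wants to script this, here is a minimal sketch of the same two steps. The helper names (neigh_mac, flannel_v6_fix_cmds, fix_if_stale) and the argument order are made up for illustration and are not part of k3s or flannel. The functions only print the commands rather than running them, so the output can be reviewed first.

```shell
#!/bin/sh
# Sketch only: helper names and argument order are hypothetical.

# Extract the link-layer address from one line of `ip -6 neighbor show`
# output, e.g. "fd00::1 dev flannel-v6.1 lladdr aa:bb:cc:dd:ee:ff PERMANENT".
neigh_mac() {
    printf '%s\n' "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") { print $(i + 1); exit } }'
}

# Print (do not run) the two workaround commands.
# $1 = rebooted node's flannel-v6.1 IP, $2 = its node-ip, $3 = its new MAC,
# $4 = device name (defaults to flannel-v6.1).
flannel_v6_fix_cmds() {
    dev=${4:-flannel-v6.1}
    printf 'ip -6 neighbor change %s dev %s lladdr %s\n' "$1" "$dev" "$3"
    printf 'bridge fdb add to %s dst %s dev %s\n' "$3" "$2" "$dev"
}

# Print the workaround commands only when the neighbor entry line ($1)
# carries a MAC that differs from the expected new one ($4).
# $2 = flannel-v6.1 IP, $3 = node-ip.
fix_if_stale() {
    current=$(neigh_mac "$1")
    if [ "$current" != "$4" ]; then
        flannel_v6_fix_cmds "$2" "$3" "$4"
    fi
}
```

On node B or C, the first argument to fix_if_stale would be the line returned by ip -6 neighbor show | grep <node A's flannel-v6.1 IP>. Once the printed commands look right, they can be run as root.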
I've been running k3s for about a year now, and this is definitely the first time this has happened. It's been a little while since I last rebooted, though, and I update k3s whenever an update comes out on the stable channel (I haven't done the v1.29 update yet due to this issue), so I suspect this regression was introduced recently.