Creating this based on #2877 (comment)
What you expected to happen?
When a new node joins the cluster, any existing node should not become unroutable.
What happened?
Today we got one more unroutable alert for one of the Kubernetes nodes (10.2.20.238). We saw that the node became unroutable just after a new node, 10.2.20.227, joined the cluster.
When I say healthy or routable, I mean that curl node_ip:node_port/endpoint has started working.
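For reference, the routability check is just a curl against the service's NodePort on the node in question; a minimal sketch, where the port 30080 and the path /endpoint are placeholders for our service's actual NodePort and health path:
$ curl -m 5 http://10.2.20.238:30080/endpoint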
Events
I0227 05:39:49 > 10.2.20.227 node add event
I0227 05:40:23 > 10.2.20.238 node unhealthy event 👎 (continuously unhealthy till 05:51:54)
I0227 05:50:25 > 10.2.20.227 became healthy for the first time(routable) 👍
I0227 05:51:54 > 10.2.20.238 got healthy 👍
I0227 06:01:15 > 10.2.20.227 node delete event
How to reproduce it?
Not sure. Maybe if the node with this IP joins the current network again, this can be reproduced. I am keeping an eye on it and will update here when I see a pattern or am able to reproduce it.
Anything else we need to know?
Cluster created with kops 1.15.0.
Versions:
$ weave version
2.6.0
$ docker version
18.06.3-ce
$ uname -a
Linux ip-10-2-21-229 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:16:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:07:57Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
MTU Setting
admin@ip-10-2-20-238:~$ sudo ifconfig| grep -i MTU | grep -v veth
datapath: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 8912
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9001
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
vxlan-6784: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65485
weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 8912
Logs:
Weave logs of 10.2.20.238, which got unhealthy: https://gist.github.com/alok87/5b99d5b07b01306c5f1f34c3eb0f1025
If you check the weave log of 10.2.20.238 above, it is filled with Captured frame from MAC errors after 10.2.20.227 joined the cluster, and 10.2.20.238 was continuously unhealthy after that. You can see there were around 541 such errors just for the 227 node.
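For reference, a count like that can be pulled from the weave container logs with something along these lines (the weave-net pod name is a placeholder for the pod running on 10.2.20.238):
$ kubectl -n kube-system logs <weave-net-pod> -c weave | grep -c 'Captured frame from MAC'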
Also seeing the same error message with Weave 2.8.1; we are not seeing this behavior in our clusters still running Weave 2.7.0.
EDIT: I believe our problems were mostly because we upgraded to Weave 2.8 without using the new DaemonSet that was introduced, so we were using the DaemonSet for v2.7 with the 2.8 image of Weave.
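Not from the original report, but a quick way to confirm which Weave image a DaemonSet is actually running is to read it out of the pod template (assuming the standard weave-net DaemonSet name in kube-system):
$ kubectl -n kube-system get ds weave-net -o jsonpath='{.spec.template.spec.containers[*].image}'
If that prints a 2.8.x image while the rest of the manifest was last applied from the 2.7 release, you are in the mismatched state described above.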