networking broken for containers with moved MAC #2436
Comments
Hmm. Curiously, making that suggested change didn't fix the problem for me. Something weird is going on... |
There's a bug in the MAC cache: I was invoking the same
Curiously, fixing that bug still does not make things work, even though the logs clearly indicate that both peers detect the moved MAC. |
It does when I run with |
When running with fastdp, after pinging from B to C (i.e. the opposite direction of the failing ping), the ping from C to B subsequently starts working. Running tcpdump on 'weave' on both machines shows that during the first C-to-B ping the ARP request gets through to B and a reply is sent, but that reply never appears on C. |
What is the probability of getting the same MAC address on different hosts? Just wondering, because we hit this bug pretty easily. |
It's 46 random bits, so pretty low.
I filed this bug separately from #2433 because I have no evidence to suggest that the symptoms there are explained by this. |
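To put a rough number on the "46 random bits" remark above, here is a birthday-bound estimate. It assumes every container MAC is drawn independently and uniformly from 2^46 values, which is a simplification for illustration; the function name is hypothetical and not part of weave.

```python
import math

# Number of random bits in a generated MAC, as stated in the thread.
MAC_RANDOM_BITS = 46


def collision_probability(n_containers, bits=MAC_RANDOM_BITS):
    """Approximate chance that at least two of n random MACs coincide.

    Uses the standard birthday approximation p ~ 1 - exp(-n(n-1) / (2 * 2^bits)),
    assuming uniform, independent draws (an assumption, not a measured fact).
    """
    space = 2 ** bits
    pairs = n_containers * (n_containers - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(f"{n:>9} containers: p ~ {collision_probability(n):.3e}")
```

Under these assumptions the chance of an accidental clash among 100 containers is on the order of 1e-10, so hitting this bug "pretty easily" points at something other than random collisions.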
I believe I have determined what is happening here:
This is why if you If you apply this patch
on top of @rade's #2437 branch, the second ping succeeds immediately. |
Hi, we recently had a network connectivity bug in Kubernetes, possibly involving weave. This message is printed on a regular basis even when the network is working, so I am not sure it is related at all. I would like to know whether this message is a sign that we might run into the problem described in this bug, or whether it is in fact usually a benign error. |
@yannrouillard see #2877 - it's not always an error. |
We have this issue a lot. It runs OK for some time, and when the pods move around there is a risk that this happens. We are running docker.io/weaveworks/weave-kube:2.4.1 on AWS with 5 nodes. Will this issue ever be prioritized? Our current workaround is to delete the weave db and delete the weave pods. Sometimes this works the first time.
|
@Nossnevs are you explicitly creating a container with the same MAC on a different node? If this is not your situation, please open a new issue and supply the requested information. |
When a MAC previously associated with a container on one host is subsequently associated with a container on a different host, the latter container has no network connectivity.
docker logs weave on host2 shows
Running weave report confirms that there is a corresponding MAC cache entry.
@bboreham was pondering whether we can make a better decision by inspecting the destination MAC, i.e. only drop the MAC if the source MAC appears to be from a different peer and the destination MAC is for a local peer.
Which, by my reading of the code, simply means turning the error into a warning and carrying on.
We'd need to think carefully about the situations in which that might create a loop.
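A minimal sketch of one possible reading of the proposal above, not the weave implementation. The names (`Action`, `decide`, the dict-based `mac_cache`) are hypothetical, and "drop" is taken here to mean dropping the captured frame, which is an assumption.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"                    # nothing suspicious, carry on
    WARN_AND_FORWARD = "warn_and_forward"  # log a warning, carry on
    DROP = "drop"                          # refuse the frame


def decide(mac_cache, local_peer, src_mac, dst_mac):
    """Decide what to do with a frame captured on the local bridge.

    mac_cache maps a MAC string to the peer name it was last seen on
    (hypothetical structure, for illustration only).
    """
    src_owner = mac_cache.get(src_mac)
    # Unknown source, or source already known to be local: no conflict.
    if src_owner is None or src_owner == local_peer:
        return Action.FORWARD
    # Source MAC is cached against another peer. Only drop when the
    # destination MAC belongs to a local peer, per the suggestion above.
    if mac_cache.get(dst_mac) == local_peer:
        return Action.DROP
    # Otherwise treat it as a moved MAC: warn and carry on.
    return Action.WARN_AND_FORWARD
```

Under this reading, a broadcast ARP request from a container whose MAC has moved would be forwarded with a warning, since the broadcast address is not cached against a local peer. Whether that can create a loop is exactly the open question raised above; the sketch does not address it.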