This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Intermittent unknownhost issues with errors in weave logs #3757

Closed
dstrimble opened this issue Jan 21, 2020 · 7 comments


@dstrimble

What you expected to happen?

I expected the unknown host exceptions to clear up in weave.

What happened?

Intermittent unknown host exceptions across apps in the Kubernetes cluster.
The weave pod logs are full of connection shutdown messages, which points to weave network issues.

INFO: 2020/01/21 10:45:47.224212 ->[172.20.33.113:6783|da:a4:78:fc:54:c6(nodew00487.nonprod.jbhunt.com)]: connection shutting down due to error: IP allocation was seeded by different peers (received: [16:c0:ca:ad:e4:62 1e:75:c6:8c:ea:73 7a:d3:c8:59:b6:f1], ours: [22:03:ad:09:70:b5(jvtk8i00402.nonprod.jbhunt.com) 7e:a3:30:e9:2a:cc(nodei00404.nonprod.company.com) a2:9f:2f:6e:e1:86(nodei00403.nonprod.company.com) a2:9f:d5:6e:9c:c3(nodew00401.nonprod.company.com)])


...

connection shutting down due to error: Multiple connections to da:a4:78:fc:54:c6(nodew00487.nonprod.company.com) added to 16:84:46:93:53:9e(nodem00402.nonprod.company.com)

Anything else we need to know?

Azure VMs

Versions:

$ weave version
weave 2.5.2
$ docker version
Client:
 Version:      17.03.2-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   f5ec1e2
 Built:        Tue Jun 27 03:35:14 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.2-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   f5ec1e2
 Built:        Tue Jun 27 03:35:14 2017
 OS/Arch:      linux/amd64
 Experimental: false

$ uname -a
Linux node00402 4.15.0-1066-azure #71-Ubuntu SMP Thu Dec 12 20:35:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:38:32Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:25:46Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Logs:

$ kubectl logs -n kube-system <weave-net-pod> weave
INFO: 2020/01/21 10:45:47.224212 ->[172.20.33.113:6783|da:a4:78:fc:54:c6(nodew00487.nonprod.jbhunt.com)]: connection shutting down due to error: IP allocation was seeded by different peers (received: [16:c0:ca:ad:e4:62 1e:75:c6:8c:ea:73 7a:d3:c8:59:b6:f1], ours: [22:03:ad:09:70:b5(jvtk8i00402.nonprod.jbhunt.com) 7e:a3:30:e9:2a:cc(nodei00404.nonprod.company.com) a2:9f:2f:6e:e1:86(nodei00403.nonprod.company.com) a2:9f:d5:6e:9c:c3(nodew00401.nonprod.company.com)])


...

connection shutting down due to error: Multiple connections to da:a4:78:fc:54:c6(nodew00487.nonprod.company.com) added to 16:84:46:93:53:9e(nodem00402.nonprod.company.com)

@bboreham
Contributor

IP allocation was seeded by different peers

See https://www.weave.works/docs/net/latest/tasks/ipam/troubleshooting-ipam/#seeded-by-different-peers

connection shutting down due to error: Multiple connections to da:a4:78:fc:54:c6(nodew00487.nonprod.company.com) added to 16:84:46:93:53:9e(nodem00402.nonprod.company.com)

This is not a problem; it's just something that can happen due to timing when many connections are being added and removed, which is caused by the fatal condition above.
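
To see how often each of the two messages appears, one option is to save the log of a single weave pod to a file and count matching lines. This is only a rough sketch: the function name count_weave_errors and the file name weave.log are made up for illustration, and the search strings are copied from the log lines quoted in this issue.

```shell
#!/bin/sh
# Count the two connection-shutdown messages quoted in this thread in a
# saved weave log. The function name is hypothetical, not part of weave.
count_weave_errors() {
  log="$1"
  # grep -c prints the number of matching lines. It exits non-zero when
  # there are no matches, so "|| true" keeps this usable under "set -e".
  seeded=$(grep -c 'IP allocation was seeded by different peers' "$log" || true)
  multiple=$(grep -c 'Multiple connections to' "$log" || true)
  echo "seeded_by_different_peers=$seeded"
  echo "multiple_connections=$multiple"
}
```

For example, `count_weave_errors weave.log` prints both counts. A non-zero first count would mean the fatal seeding condition is still present in that log.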

@dstrimble
Author

I've stopped the IP allocation errors, but I'm still seeing closed connections and intermittent unknown host exceptions in apps.

@bboreham
Contributor

When we ask for "logs" we mean the whole thing.

@jfkilpat

jfkilpat commented Jan 21, 2020

I work with @dstrimble, here are 30 minutes of logs.

query_data.txt

@saada

saada commented Jan 21, 2020

Every line in that file is a "Discovered remote MAC" log entry. Do you have any other logs that show the errors mentioned above?

@bboreham
Contributor

bboreham commented Jan 21, 2020

Is this a joke?

You seem to have supplied the logs from many different pods all mushed together with nothing to distinguish them.

We ask for the logs from one pod.
Generally if something is going to go wrong it will be towards the top of the log.
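
As a sketch of collecting the complete log from one pod and looking at the top of it, something like the following might work. The function name save_weave_log, the KUBECTL override variable and the default output file name are assumptions made for this example; the kubectl logs call, the kube-system namespace and the weave container name follow the command shown in the issue description.

```shell
#!/bin/sh
# Save the complete weave container log of ONE pod to a file, then show
# the first lines, where startup problems usually appear.
# save_weave_log and KUBECTL are hypothetical names for this sketch;
# KUBECTL only exists so a different kubectl binary can be substituted.
save_weave_log() {
  pod="$1"
  out="${2:-$pod.log}"
  "${KUBECTL:-kubectl}" logs -n kube-system "$pod" -c weave > "$out"
  head -n 50 "$out"
}
```

Called as `save_weave_log <weave-net-pod>`, this leaves the full log in `<weave-net-pod>.log`, which keeps output from different pods in separate files.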

@dstrimble
Author

That's current. Deleting /var/lib/weave on about half of the nodes cleared up the weave errors, but we are still getting unknown host exceptions.
