This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

weave-net failing to sleeve on EKS #3781

Closed
ac-hibbert opened this issue Mar 10, 2020 · 1 comment · Fixed by #3783

ac-hibbert commented Mar 10, 2020

What you expected to happen?

Weave works in fastdp mode

What happened?

Weave falls back to sleeve mode

How to reproduce it?

On startup, every peer connection seems to switch from fastdp to sleeve mode immediately (see the logs below).

Anything else we need to know?

  • EKS
  • us-east-1
  • 2 subnets in 2 AZs
  • 4 nodes
  • r5.xlarge

Versions:

# /home/weave/weave --local version
weave 2.6.1
$ docker version
# uname -a
Linux ip-10-207-56-160.ec2.internal 4.14.171-136.231.amzn2.x86_64 #1 SMP Thu Feb 27 20:22:48 UTC 2020 x86_64 Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-23T14:21:36Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.9-eks-502bfb", GitCommit:"502bfb383169b124d87848f89e17a04b9fc1f6f0", GitTreeState:"clean", BuildDate:"2020-02-07T01:31:02Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Logs:

$ kubectl logs -n kube-system weave-net-pnv9s -c weave
DEBU: 2020/03/10 22:51:36.008130 [kube-peers] Checking peer "66:df:31:d5:0e:9b" against list &{[{56:49:51:13:23:25 ip-10-207-56-224.ec2.internal} {8a:a4:a4:85:cb:6d ip-10-207-56-243.ec2.internal} {66:df:31:d5:0e:9b ip-10-207-56-160.ec2.internal} {62:77:0a:f2:8e:69 ip-10-207-56-144.ec2.internal}]}
INFO: 2020/03/10 22:51:36.060099 Command line options: map[conn-limit:200 datapath:datapath db-prefix:/weavedb/weave-net docker-api: expect-npc:true host-root:/host http-addr:127.0.0.1:6784 ipalloc-init:consensus=3 ipalloc-range:100.64.0.0/10 metrics-addr:0.0.0.0:6782 mtu:1376 name:66:df:31:d5:0e:9b nickname:ip-10-207-56-160.ec2.internal no-dns:true port:6783]
INFO: 2020/03/10 22:51:36.060129 weave  2.6.1
INFO: 2020/03/10 22:51:36.253396 Re-exposing 100.68.0.0/10 on bridge "weave"
INFO: 2020/03/10 22:51:36.277079 Bridge type is bridged_fastdp
INFO: 2020/03/10 22:51:36.277095 Communication between peers is unencrypted.
INFO: 2020/03/10 22:51:36.330882 Our name is 66:df:31:d5:0e:9b(ip-10-207-56-160.ec2.internal)
INFO: 2020/03/10 22:51:36.330925 Launch detected - using supplied peer list: [10.207.56.144 10.207.56.224 10.207.56.243]
INFO: 2020/03/10 22:51:36.346564 Checking for pre-existing addresses on weave bridge
INFO: 2020/03/10 22:51:36.346677 weave bridge has address 100.68.0.0/10
INFO: 2020/03/10 22:51:36.351519 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.351668 Found address 100.68.0.2/10 for ID _
INFO: 2020/03/10 22:51:36.351798 Found address 100.68.0.2/10 for ID _
INFO: 2020/03/10 22:51:36.351907 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.351998 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352087 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352174 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352259 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352339 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352425 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352508 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352620 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352705 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352792 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352877 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.352963 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.353051 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.353137 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.353221 Found address 100.68.0.1/10 for ID _
INFO: 2020/03/10 22:51:36.353331 Found address 100.68.0.3/10 for ID _
INFO: 2020/03/10 22:51:36.353447 Found address 100.68.0.3/10 for ID _
INFO: 2020/03/10 22:51:36.353534 Found address 100.68.0.3/10 for ID _
INFO: 2020/03/10 22:51:36.353718 Found address 100.68.0.4/10 for ID _
INFO: 2020/03/10 22:51:36.353813 Found address 100.68.0.4/10 for ID _
INFO: 2020/03/10 22:51:36.353899 Found address 100.68.0.4/10 for ID _
INFO: 2020/03/10 22:51:36.354138 [allocator 66:df:31:d5:0e:9b] Initialising with persisted data
INFO: 2020/03/10 22:51:36.354241 Sniffing traffic on datapath (via ODP)
INFO: 2020/03/10 22:51:36.360466 ->[10.207.56.243:6783] attempting connection
INFO: 2020/03/10 22:51:36.360571 ->[10.207.56.144:6783] attempting connection
INFO: 2020/03/10 22:51:36.360628 ->[10.207.56.224:6783] attempting connection
INFO: 2020/03/10 22:51:36.361538 ->[10.207.56.144:6783|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: connection ready; using protocol version 2
INFO: 2020/03/10 22:51:36.361597 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] using fastdp
INFO: 2020/03/10 22:51:36.361630 ->[10.207.56.144:6783|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: connection added (new peer)
INFO: 2020/03/10 22:51:36.362526 fastdp ->[10.207.56.144:6784|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: IPSec init SA remote
WARN: 2020/03/10 22:51:36.362554 fastdp ->[10.207.56.144:6784|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: IPSec init SA remote failed: deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.362722 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] fastdp deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.362735 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] using sleeve
INFO: 2020/03/10 22:51:36.363398 ->[10.207.56.243:6783|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: connection ready; using protocol version 2
INFO: 2020/03/10 22:51:36.363398 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2020/03/10 22:51:36.363456 overlay_switch ->[8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)] using fastdp
INFO: 2020/03/10 22:51:36.363476 ->[10.207.56.243:6783|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: connection added (new peer)
INFO: 2020/03/10 22:51:36.363635 fastdp ->[10.207.56.243:6784|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: IPSec init SA remote
WARN: 2020/03/10 22:51:36.363658 fastdp ->[10.207.56.243:6784|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: IPSec init SA remote failed: deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.363660 Listening for metrics requests on 0.0.0.0:6782
INFO: 2020/03/10 22:51:36.363991 overlay_switch ->[8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)] fastdp deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.364003 overlay_switch ->[8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)] using sleeve
INFO: 2020/03/10 22:51:36.364140 ->[10.207.56.224:6783|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: connection ready; using protocol version 2
INFO: 2020/03/10 22:51:36.364177 overlay_switch ->[56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)] using fastdp
INFO: 2020/03/10 22:51:36.364194 ->[10.207.56.224:6783|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: connection added (new peer)
INFO: 2020/03/10 22:51:36.365314 fastdp ->[10.207.56.224:6784|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: IPSec init SA remote
WARN: 2020/03/10 22:51:36.365347 fastdp ->[10.207.56.224:6784|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: IPSec init SA remote failed: deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.365417 overlay_switch ->[56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)] fastdp deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.365545 overlay_switch ->[56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)] using sleeve
INFO: 2020/03/10 22:51:36.462200 ->[10.207.56.224:6783|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: connection fully established
INFO: 2020/03/10 22:51:36.462418 ->[10.207.56.144:6783|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: connection fully established
INFO: 2020/03/10 22:51:36.462519 ->[10.207.56.243:6783|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: connection fully established
INFO: 2020/03/10 22:51:36.462665 sleeve ->[10.207.56.144:6783|62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)]: Effective MTU verified at 8939
INFO: 2020/03/10 22:51:36.463359 sleeve ->[10.207.56.243:6783|8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)]: Effective MTU verified at 8939
INFO: 2020/03/10 22:51:36.463529 sleeve ->[10.207.56.224:6783|56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)]: Effective MTU verified at 8939
INFO: 2020/03/10 22:51:37.068350 [kube-peers] Added myself to peer list &{[{56:49:51:13:23:25 ip-10-207-56-224.ec2.internal} {8a:a4:a4:85:cb:6d ip-10-207-56-243.ec2.internal} {66:df:31:d5:0e:9b ip-10-207-56-160.ec2.internal} {62:77:0a:f2:8e:69 ip-10-207-56-144.ec2.internal}]}
DEBU: 2020/03/10 22:51:37.079517 [kube-peers] Nodes that have disappeared: map[]
100.68.0.0
DEBU: 2020/03/10 22:51:37.148724 registering for updates for node delete events
INFO: 2020/03/10 22:51:38.906194 Discovered remote MAC 9e:0a:32:5b:2b:87 at 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)
INFO: 2020/03/10 22:51:39.162080 Discovered remote MAC 8e:4f:03:51:da:69 at 8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)
INFO: 2020/03/10 22:51:39.217053 Discovered remote MAC d6:8b:82:d9:52:5f at 56:49:51:13:23:25(ip-10-207-56-224.ec2.internal)
INFO: 2020/03/10 22:51:39.925408 Discovered remote MAC e6:bd:bd:e7:72:23 at 8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal)
INFO: 2020/03/10 22:51:40.392799 Discovered remote MAC 7a:a0:dc:d9:08:11 at 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)
INFO: 2020/03/10 22:52:06.210852 Discovered remote MAC 72:2f:ec:b2:15:80 at 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)
INFO: 2020/03/10 22:52:08.926139 Discovered remote MAC 2a:59:10:8e:77:ab at 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)
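
To make the fallback easier to spot in a long log, the `overlay_switch ... using <mode>` lines can be pulled out per peer. The following is only a minimal sketch: the function name `overlay_transitions` and the regular expression are illustrative assumptions based on the log lines above, not part of weave itself.

```python
import re

# Matches lines such as:
#   INFO: ... overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] using fastdp
# The pattern is an assumption derived from the log format shown in this issue.
SWITCH_RE = re.compile(
    r"overlay_switch ->\[(?P<mac>[0-9a-f:]{17})\((?P<host>[^)]*)\)\] using (?P<mode>\w+)"
)


def overlay_transitions(log_text):
    """Return {peer nickname: [overlay modes in the order they were logged]}."""
    transitions = {}
    for line in log_text.splitlines():
        match = SWITCH_RE.search(line)
        if match:
            transitions.setdefault(match.group("host"), []).append(match.group("mode"))
    return transitions


# Three lines copied from the log above for one peer.
sample = """\
INFO: 2020/03/10 22:51:36.361597 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] using fastdp
INFO: 2020/03/10 22:51:36.362722 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] fastdp deserialize InitSARemote: empty msg
INFO: 2020/03/10 22:51:36.362735 overlay_switch ->[62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal)] using sleeve
"""

for host, modes in overlay_transitions(sample).items():
    # A peer that logs fastdp followed by sleeve has fallen back.
    print(host, " -> ".join(modes))
```

For the sample lines this prints `ip-10-207-56-144.ec2.internal fastdp -> sleeve`. In the full log, all three peers show the same pattern within a few milliseconds of the `IPSec init SA remote failed` warning.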

Debug

Status:

# /home/weave/weave --local status

        Version: 2.6.1 (up to date; next check at 2020/03/11 06:10:37)

        Service: router
       Protocol: weave 1..2
           Name: 66:df:31:d5:0e:9b(ip-10-207-56-160.ec2.internal)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 3
    Connections: 3 (3 established)
          Peers: 4 (with 12 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 100.64.0.0/10
  DefaultSubnet: 100.64.0.0/10

Connections status:

# /home/weave/weave --local status connections
-> 10.207.56.144:6783    established sleeve 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal) mtu=8939
-> 10.207.56.243:6783    established sleeve 8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal) mtu=8939
-> 10.207.56.224:6783    established sleeve 56:49:51:13:23:25(ip-10-207-56-224.ec2.internal) mtu=8939
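
When checking several nodes, it can help to summarise the `status connections` output by overlay mode. This is a rough sketch that assumes the column layout shown above (direction, address, state, mode, peer); the helper name `connection_modes` is hypothetical, and lines in other states are skipped because their layout may differ.

```python
from collections import Counter


def connection_modes(status_text):
    """Count established connections per overlay mode (e.g. fastdp, sleeve)."""
    modes = Counter()
    for line in status_text.splitlines():
        fields = line.split()
        # Assumed layout: direction, address, state, mode, peer, optional key=value pairs.
        if len(fields) >= 4 and fields[0] in ("->", "<-") and fields[2] == "established":
            modes[fields[3]] += 1
    return modes


# Output copied from the status command above.
sample_status = """\
-> 10.207.56.144:6783    established sleeve 62:77:0a:f2:8e:69(ip-10-207-56-144.ec2.internal) mtu=8939
-> 10.207.56.243:6783    established sleeve 8a:a4:a4:85:cb:6d(ip-10-207-56-243.ec2.internal) mtu=8939
-> 10.207.56.224:6783    established sleeve 56:49:51:13:23:25(ip-10-207-56-224.ec2.internal) mtu=8939
"""

counts = connection_modes(sample_status)
print(dict(counts))
```

For the output above, all three established connections are counted under `sleeve` and none under `fastdp`, which matches the behaviour reported here.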

murali-reddy commented Mar 11, 2020

Thanks @Hibbert for reporting the issue.

I can confirm the reported issue on 2.6.1. It looks like a regression in 2.6.1: after switching to 2.6.0 I see fastdp being used, but not with 2.6.1. Will investigate further.
