This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Weave-Net Addon causing kernel panics on RPI 3B+. #3314

Closed
12wrigja opened this issue Jun 11, 2018 · 25 comments

@12wrigja

What you expected to happen?

What I expected to happen: upon two weave pods discovering each other, weave would start working.

What happened?

The weave pod seems to execute some command that causes one of the two nodes connecting to each other to crash with a kernel panic. I'm guessing it's unlikely that weave itself is the root cause here, but I figure this is a good place to start.

Logs from the kernel are at the end of this issue.

How to reproduce it?

1. Set up Kubernetes 1.9.7 on two nodes, and apply the K8s Weave addon using
   $ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" - this seems to install version 2.3.0 according to the pod descriptions from the API.
2. Wait a short amount of time for the pods to try to connect to each other.
3. Notice that one of the machines has rebooted, and that the other is unable to connect to it (as it has crashed).

Anything else we need to know?

Probably the most important part: both nodes in this case are running the latest version of Raspbian, as they are Raspberry Pi 3B+ machines. They are all located on a home network, with IPs 192.168.0.3-5, statically assigned. These are configured using Ansible to an extent, and I might be able to share the scripts used if needed.

Versions:

$ weave version (found by exec-ing into a running pod awaiting connections)
/home/weave # ./weave --local version
weave 2.3.0

$ docker version
Docker version 18.04.0-ce, build 3d479c0

$ uname -a
Linux m1 4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T10:09:24Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-18T23:58:35Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}

Logs:

Before one node connects to the other, everything looks mostly fine. The initial connections are attempted to each of the three peers - .3, .4, and .5.

$ kubectl logs -n kube-system <weave-net-pod> weave
DEBU: 2018/06/11 05:43:56.878318 [kube-peers] Checking peer "ee:bf:2b:a9:06:ad" against list &{[{42:fc:dc:59:ea:96 m1} {3e:39:75:92:f1:9b m3} {02:f9:0c:1b:52:04 m3} {ee:bf:2b:a9:06:ad m1}]}
INFO: 2018/06/11 05:43:56.993279 Command line options: map[ipalloc-init:consensus=3 ipalloc-range:10.32.0.0/12 name:ee:bf:2b:a9:06:ad nickname:m1 datapath:datapath db-prefix:/weavedb/weave-net docker-api: host-root:/host http-addr:127.0.0.1:6784 metrics-addr:0.0.0.0:6782 no-dns:true port:6783 conn-limit:100 expect-npc:true]
INFO: 2018/06/11 05:43:56.993509 weave  2.3.0
INFO: 2018/06/11 05:43:57.589147 Bridge type is bridged_fastdp
INFO: 2018/06/11 05:43:57.589233 Communication between peers is unencrypted.
INFO: 2018/06/11 05:43:57.625030 Our name is ee:bf:2b:a9:06:ad(m1)
INFO: 2018/06/11 05:43:57.625248 Launch detected - using supplied peer list: [192.168.0.3 192.168.0.4 192.168.0.5]
INFO: 2018/06/11 05:43:57.628632 Checking for pre-existing addresses on weave bridge
INFO: 2018/06/11 05:43:57.645824 [allocator ee:bf:2b:a9:06:ad] Initialising with persisted data
INFO: 2018/06/11 05:43:57.651091 Sniffing traffic on datapath (via ODP)
INFO: 2018/06/11 05:43:57.662415 ->[192.168.0.3:6783] attempting connection
INFO: 2018/06/11 05:43:57.674431 ->[192.168.0.4:6783] attempting connection
INFO: 2018/06/11 05:43:57.674875 ->[192.168.0.5:6783] attempting connection
INFO: 2018/06/11 05:43:57.675066 ->[192.168.0.3:57939] connection accepted
INFO: 2018/06/11 05:43:57.675843 ->[192.168.0.4:6783] error during connection attempt: dial tcp4 :0->192.168.0.4:6783: connect: connection refused
INFO: 2018/06/11 05:43:57.676161 ->[192.168.0.5:6783] error during connection attempt: dial tcp4 :0->192.168.0.5:6783: connect: connection refused
INFO: 2018/06/11 05:43:57.679452 ->[192.168.0.3:57939|ee:bf:2b:a9:06:ad(m1)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/06/11 05:43:57.680866 ->[192.168.0.3:6783|ee:bf:2b:a9:06:ad(m1)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/06/11 05:43:57.699266 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/06/11 05:43:57.700143 Listening for metrics requests on 0.0.0.0:6782
INFO: 2018/06/11 05:43:58.690696 [kube-peers] Added myself to peer list &{[{42:fc:dc:59:ea:96 m1} {3e:39:75:92:f1:9b m3} {02:f9:0c:1b:52:04 m3} {ee:bf:2b:a9:06:ad m1}]}
DEBU: 2018/06/11 05:43:58.703575 [kube-peers] Nodes that have disappeared: map[]
INFO: 2018/06/11 05:43:59.078850 ->[192.168.0.4:6783] attempting connection
INFO: 2018/06/11 05:43:59.080194 ->[192.168.0.4:6783] error during connection attempt: dial tcp4 :0->192.168.0.4:6783: connect: connection refused
INFO: 2018/06/11 05:43:59.315638 ->[192.168.0.5:6783] attempting connection
INFO: 2018/06/11 05:43:59.316856 ->[192.168.0.5:6783] error during connection attempt: dial tcp4 :0->192.168.0.5:6783: connect: connection refused
INFO: 2018/06/11 05:44:02.806629 ->[192.168.0.5:6783] attempting connection
INFO: 2018/06/11 05:44:02.808304 ->[192.168.0.5:6783] error during connection attempt: dial tcp4 :0->192.168.0.5:6783: connect: connection refused
INFO: 2018/06/11 05:44:03.554820 ->[192.168.0.4:6783] attempting connection

Once one pod connects to another, it's random which one crashes (but one always does). All the "logs" of the kernel panic that I could get hold of are here:

Jun 11 05:28:31 m1 kubelet[785]: I0611 05:28:31.922238     785 kubelet.go:2118] Container runtime status: Runtime Conditions: RuntimeReady=true reason: message:, NetworkReady=true reason: message:
Jun 11 05:28:32 m1 kernel: [ 2162.741648] Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jun 11 05:28:32 m1 kernel: [ 2162.744896] pgd = 921c0000
Jun 11 05:28:32 m1 kernel: [ 2162.747989] [00000000] *pgd=19c2c835, *pte=00000000, *ppte=00000000

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.751060] Internal error: Oops: 80000007 [#1] SMP ARM
Jun 11 05:28:32 m1 kernel: [ 2162.751060] Internal error: Oops: 80000007 [#1] SMP ARM
Jun 11 05:28:32 m1 kernel: [ 2162.754243] Modules linked in: xt_NFLOG veth dummy vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 nfnetlink_log xt_statistic xt_nat xt_recent ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set_hash_ip xt_set ip_set xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc overlay cmac bnep hci_uart btbcm serdev bluetooth ecdh_generic evdev joydev sg brcmfmac brcmutil cfg80211 rfkill snd_bcm2835(C) snd_pcm snd_timer snd uio_pdrv_genirq fixed uio ip_tables x_tables ipv6

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.816788] Process weaver (pid: 1896, stack limit = 0x92190210)

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.820871] Stack: (0x921919f0 to 0x92192000)

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.824936] 19e0:                                     00000000 00000000 0500a8c0 92191a88

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.829076] 1a00: 0000801a 00008bad b88a5bd0 b88a5b98 92191d2c 7f637ad0 00000001 92191a5c

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.833081] 1a20: 23d23b00 00000000 b88a5bd0 99ed4000 00000050 b406e000 00000000 99ed4050

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.837243] 1a40: 00000000 00008bad 00000040 0000801a 92191a68 00002100 00000000 00000000

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.841436] 1a60: 00008000 0000ee47 00000002 0500a8c0 00000000 00000000 00000000 00000000

Message from syslogd@m1 at Jun 11 05:28:32 ...
 kernel:[ 2162.845629] 1a80: 00000000 00000000 0300a8c0 00000000 00000000 00000000 00000000 00000000
Shared connection to m1 closed.
@brb
Contributor

brb commented Jun 11, 2018

@12wrigja Thanks for the issue.

The syslog output of the oops doesn't include the contents of the general-purpose registers. Could you check whether they show up in dmesg? I'm interested in the pc value so I can trace down the function which caused the panic.
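
(For reference, a rough way to pull the full oops, register dump included, out of the kernel log, assuming the node hasn't rebooted since the crash; the grep context sizes are just a guess:)

$ dmesg | grep -B 5 -A 40 "Internal error: Oops"
# if the node already rebooted and persistent journald logging is enabled,
# the previous boot's kernel messages may still be available:
$ journalctl -k -b -1 | grep -B 5 -A 40 "Internal error: Oops"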

@brb
Contributor

brb commented Jun 11, 2018

I don't have hard evidence, but the kernel version rings a bell: coreos/bugs#2382.

@brb
Contributor

brb commented Jun 11, 2018

Just checked, and indeed the latest Raspbian is prone to the kernel bug mentioned in my comment above, as https://github.com/raspberrypi/linux/tree/raspberrypi-kernel_1.20180417-1 is missing the fix: torvalds/linux@f15ca72#diff-4f541554c5f8f378effc907c8f0c9115.

To work around it, you can disable fastdp by re-deploying your cluster with kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.WEAVE_NO_FASTDP=1". You might need to remove the "weave" interface if it exists before the re-deploy; see the sketch below.
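
(A rough sketch of the full sequence, assuming the interface name below matches what Weave Net created on your nodes:)

# remove the existing Weave Net deployment
$ kubectl delete -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
# on each node, delete the leftover "weave" bridge if it is still there
$ sudo ip link delete weave
# re-deploy with fastdp disabled
$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.WEAVE_NO_FASTDP=1"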

@brb
Contributor

brb commented Jun 11, 2018

@12wrigja OT: Did you compile Weave Net yourself? Many are struggling to run it on the RPI 3 B+ due to an invalid image arch: #3276.

@arnulfojr

arnulfojr commented Jun 11, 2018

I'm also experiencing this issue; I've been trying to set it up without any success. Sadly there are no logs from Kubernetes, and most of the time my Pi (3B+) reboots.

uname -a
Linux black-pearl 4.14.34-hypriotos-v7+ #1 SMP Sun Apr 22 14:57:31 UTC 2018 armv7l GNU/Linux
$ sudo docker version
Client:
 Version:       18.04.0-ce
 API version:   1.37
 Go version:    go1.9.4
 Git commit:    3d479c0
 Built: Tue Apr 10 18:25:24 2018
 OS/Arch:       linux/arm
 Experimental:  false
 Orchestrator:  swarm

Server:
 Engine:
  Version:      18.04.0-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.4
  Git commit:   3d479c0
  Built:        Tue Apr 10 18:21:25 2018
  OS/Arch:      linux/arm
  Experimental: false
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}

@brb
Contributor

brb commented Jun 11, 2018

@arnulfojr Have you tried following the workaround suggested above?

@12wrigja
Author

12wrigja commented Jun 11, 2018 via email

@12wrigja
Author

Following up here:

  • Using docker inspect on weaveworks/weave-kube:2.3.0, I apparently get an amd64 image, which nonetheless seems to run just fine on my RPI 3B+. Is the output from inspect wrong? See https://gist.github.com/12wrigja/19b8074a00a565dc089941a4a28d744f for the results of inspecting (a quick way to reproduce the check is sketched after this list).
  • Disabling fastdp seems to have fixed my issues. I then noticed that my Pod and Service CIDRs were overlapping, changed Weave to use the right CIDR, and now I'm good to go.
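
(A one-liner to pull just the relevant fields out of the local image metadata; the image name is the one from the gist and may differ from the exact tag your daemonset pulled:)

$ docker inspect --format '{{.Architecture}} {{.Os}}' weaveworks/weave-kube:2.3.0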

I subscribed to the issue you created for the rpi kernel, and would be happy to test any fixes they release.

@brb
Contributor

brb commented Jun 12, 2018

@12wrigja Thanks for the follow up.

Is the output from inspect wrong?

Interesting. The Architecture field is set to amd64, and when I inspect the image with manifest-tool I cannot find the image you inspected among the ones returned. I get the same inspect output as you when running on a Scaleway armv7 VM.
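
(For reference, the kind of check meant above; a sketch assuming manifest-tool is installed and the registry is Docker Hub:)

# list every platform-specific image behind the tag on the registry
$ manifest-tool inspect weaveworks/weave-kube:2.3.0
# digest of the image the node actually pulled, for comparison
$ docker inspect --format '{{index .RepoDigests 0}}' weaveworks/weave-kube:2.3.0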

@arnulfojr

arnulfojr commented Jun 12, 2018

First off, I'm glad I'm getting the same error as @12wrigja, as debugging this is close to impossible (at least for me) and a bit frustrating 😅. I have inspected the image and indeed I get an amd64-architecture one.

https://gist.github.com/arnulfojr/a32a33a42dc7e8254e1abdbb5e3873df

The kernel panic and the amd64 image are present on all my Pis (2 rev1, 3, 3B+).
I'll try deactivating fastdp.

@brb
Contributor

brb commented Jun 12, 2018

@arnulfojr Could you answer my question above?

@arnulfojr

arnulfojr commented Jun 12, 2018

@brb So indeed, it is fastdp; sorry, I started over from scratch and it took me a while to flash the SD cards.

So far, deactivating it avoids the kernel panic. I'm using Weave as a plugin with Kubernetes, and finally all the nodes came to life! In the meantime, it would be good if you could post officially somewhere about the kernel issue, so people hitting it can easily track it down. Furthermore, I still get the amd64 image when kube pulls weave-kube:2.3.0 and weave-npc:2.3.0, though.

I still have to test it under real usage, but so far my issue went away by deactivating fastdp. From an expert to a newbie: what's the drawback of deactivating it?

Thanks a lot for filing the issues in the Pi kernel repo!

$ kubectl get nodes
NAME              STATUS    ROLES     AGE       VERSION
black-pearl       Ready     master    10m       v1.9.8
flying-dutchman   Ready     <none>    5m        v1.9.8
wicked-wench      Ready     <none>    1m        v1.9.8

@brb
Contributor

brb commented Jun 13, 2018

@arnulfojr

Thanks.

what's the drawback of deactivating it?

The non-fastdp mode, also known as sleeve, is slower and consumes more CPU cycles. Please see https://www.weave.works/blog/weave-docker-networking-performance-fast-data-path/ for more details.

it would be cool if you can post officially somewhere about the kernel issue

Agreed. We are going to update our docs to include known issues.

@bernhara

I'm still trying to help identify the source of the problem.
And I'm fighting with the "fastdp" mode: I understood that a "weave reset --force" is mandatory to make Weave Net run again, even on x86.

During the tests, I got a kernel crash log on my Pi (note that I am running a 32-bit kernel):

Message from syslogd@s-pituum-01 at Jun 12 18:18:43 ...
kernel:[ 466.052644] Internal error: Oops: 80000007 [#1] SMP ARM

For info:

s-pituum-01% uname -a
Linux s-pituum-01 4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux

@bboreham
Contributor

I understood that a "weave reset --force" is mandatory to make "weavenet" run again

I may be missing some context: this command is necessary if Weave Net is not running and you want to tear down all the supporting structures. For instance if you want to move from fastdp to sleeve.

In many cases a reboot is an easier option, and will make everything go away except the IP allocation data.
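
(For the standalone, non-Kubernetes case being discussed here, the sequence would look roughly like this; a sketch assuming the weave script is on the PATH:)

# tear down the bridge/datapath that the previous fastdp launch created
$ weave reset --force
# relaunch in sleeve mode
$ WEAVE_NO_FASTDP=1 weave launch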

@bernhara

I don't have a clear understanding of why this is necessary, but the "weave net" container crashed during launch when switching from "fastdp" to non-fastdp (even on x86, where it otherwise ran without any problem).
I guess that the reset removed the virtual network bridges, so that the next launch recreates them with the correct characteristics.

@bernhara

An update about my latest tests.

I launched the weave container with "fastdp" disabled:

$ WEAVE_NO_FASTDP=1 ./WEAVE/weave launch --no-restart

And the container is now running for about a whole day:

s-pituum-01% docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6db76666a184 weaveworks/weave:2.3.0 "/home/weave/weaver …" 18 hours ago Up 18 hours weave

With "fastdp"activated, the container generates the kernel crash we are discussing here after about one minute.

In fact, this is a satisfactory workaround for me.
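
(A quick way to double-check which mode the running container picked, based on the startup log line quoted earlier in this issue; with fastdp disabled it should report a plain bridge rather than bridged_fastdp:)

$ docker logs weave 2>&1 | grep "Bridge type"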

@brb
Contributor

brb commented Jun 28, 2018

According to raspberrypi/linux#2580 (comment), the issue is going to be fixed in the next Raspbian release, or it may have already been fixed in the raspberrypi-kernel package.

Could someone try updating the package and running Weave Net with fastdp? Thanks.

@12wrigja
Author

Martynas,

I updated my Raspberry Pis to the latest kernel by running rpi-update (4.14.52-v7+ in my case), adjusted my Weave daemonset to re-enable fastdp, and restarted all my nodes. Everything works fine: weave-kube reports it's using bridged_fastdp and none of my nodes crash.
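
(In case it helps anyone repeating this, the rough sequence; the daemonset name and pod label below are the ones the cloud.weave.works manifest uses, so double-check against your own deployment:)

# on every node: update the kernel and reboot
$ sudo rpi-update
$ sudo reboot
# remove the WEAVE_NO_FASTDP env var from the weave container spec
$ kubectl -n kube-system edit daemonset weave-net
# recreate the pods so they pick up the change, then confirm the mode
$ kubectl -n kube-system delete pods -l name=weave-net
$ kubectl -n kube-system logs <weave-net-pod> weave | grep "Bridge type"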

@bboreham
Contributor

Thanks @12wrigja; I'll close this issue based on your report.

@brb brb added this to the n/a milestone Jun 29, 2018
@bernhara

bernhara commented Jul 6, 2018

One more confirmation: I upgraded my Raspbian with a standard "apt upgrade", which upgraded my kernel.
Everything now works as expected.

@brb
Contributor

brb commented Jul 9, 2018

@bernhara Thanks for letting us know.

@press5

press5 commented Dec 16, 2018

EDIT: I see it's been fixed upstream already. Ignore me.

I have this problem too. Would a full dump of the OOPS help you?

Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:31:33Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/arm"}

Server: Docker Engine - Community
Engine:
Version: 18.09.0
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:17:57 2018
OS/Arch: linux/arm
Experimental: false

Linux k3 4.14.34-hypriotos-v7+ #1 SMP Sun Apr 22 14:57:31 UTC 2018 armv7l GNU/Linux

Internal error: Oops: 80000007 [#1] SMP ARM
Process weaver (pid: 17069, stack limit = 0xb6666210)
Stack: (0xb66679f0 to 0xb6668000)
79e0:                                     00000000 00000000 5558a8c0 b6667a90
7a00: 0000801a 0000cacc 9284c450 9284c418 b6667a5c b6667a20 80c7b140 a3e03500
7a20: 80c7b140 00000000 b6667a64 00000000 00000000 a5f59850 00000000 0000cacc
7a40: 92a9c000 bcda8500 00002100 a5f59800 00000050 0000801a f2fcfc00 00000000
7a60: 00000000 00000000 00008000 0000ee47 00000002 5558a8c0 00000000 00000000
7a80: 00000000 00000000 00000000 00000000 5658a8c0 00000000 00000000 00000000
7aa0: 00000000 00000000 00000000 9284cb40 bcda8000 00002000 bcda8000 0000056e
7ac0: a5f59800 a3e85500 b6667b54 b6667ad8 7f79032c 7f78ecf4 00000000 00000000
7ae0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7b00: 00000000 00000000 00000000 8067af3c 00000040 401d5809 00000040 9284cb40
7b20: 00000000 bcda8000 b6667b64 9284cb40 00000001 bcda8000 bcda8000 0000056e
7b40: 00000000 a3e85500 b6667b9c b6667b58 8067b4a4 7f78ff34 b6667b9c b6667b68
7b60: 8067b0ec b6667bb0 80c04e84 00000000 00000000 80b8c578 00000001 9284cb40
7b80: bcda8000 a3e85500 00000008 9284cb40 b6667bf4 b6667ba0 8067be44 8067b410
7ba0: 00000000 00000000 00000000 00000000 fffffff4 00000000 00000000 00000000
7bc0: 00000000 00000000 00000000 9284cb40 93604140 0000ffff 93604140 928a72c0
7be0: 00000008 9284cb40 b6667c04 b6667bf8 8067bfcc 8067b744 b6667c24 b6667c08
7c00: 7f763bd8 8067bfbc 00000000 9284cb40 9316c4a8 00000000 b6667c54 b6667c28
7c20: 7f75632c 7f763afc b6667c54 b6667c38 7f75db18 00000002 a3e0359c 9284cb40
7c40: 00000001 9316c4a8 b6667cdc b6667c58 7f75774c 7f7562d8 00000001 00000400
7c60: 00000000 00000001 00000000 80273138 80c7b140 00000001 00000000 9316c4a8
7c80: 00000001 a3e03590 004000c0 928a72c0 80665534 00000040 9316c4a8 b6667d24
7ca0: 80c7b140 00000001 a5f5b014 a5f5b590 9316c4a8 80b8d820 928a72c0 9284cb40
7cc0: 00000001 9316c4a8 a5f5b014 00000000 b6667d0c b6667ce0 7f757ac4 7f7563e8
7ce0: 00000014 b6667cf0 928a72c0 9284cb40 9316c470 a3e03580 9316c4a8 a5f5b014
7d00: b6667d54 b6667d10 7f75929c 7f757a70 00000001 00000044 7f768294 00000001
7d20: b6667d54 a3e03580 804b5018 7f76827c 7f76c050 a5f5b000 9284c540 80c7b140
7d40: 93604a00 b6667dc0 b6667dbc b6667d58 806b85f8 7f7590cc 7f768294 b6667dc0
7d60: a5f5b010 00000018 00000025 000042a6 a5f5b000 a5f5b010 a5f5b014 93604a00
7d80: 80c7b140 00000000 00000000 b6667dc0 807878e8 9284c540 a5f5b000 806b83d0
7da0: 000005d4 00000000 00000000 9284c540 b6667dfc b6667dc0 806b7500 806b83dc
7dc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7de0: 807878e8 80c7e9b8 9284c540 9284c540 b6667e14 b6667e00 806b83bc 806b7430
7e00: bce3e000 a5e47400 b6667e44 b6667e18 806b6c70 806b8398 b6667ea4 7fffffff
7e20: a5e47400 b6667ed0 a5e47400 000005d4 00000000 00000000 b6667ea4 b6667e48
7e40: 806b70c4 806b6b00 014000c0 000154de 154de000 802a6bd8 92be5400 00000008
7e60: b6667ec4 a3cd3180 00000000 000042a6 00000000 00000000 00000000 00000000
7e80: 99970540 1501d56c 0000000c 00000000 b6666000 00000000 b6667eb4 b6667ea8
7ea0: 8065a438 806b6e18 b6667fa4 b6667eb8 8065b570 8065a420 b6667ed8 3da2c000
7ec0: 00000000 00000001 154de000 000005d4 b6667f00 0000000c 00000001 00000000
7ee0: 00000000 b6667ed0 00000000 15384ee0 00000000 00000000 00000000 8078ae20
7f00: 00000010 00000000 00000000 b6667f18 80784b8c 80144944 b6667f44 b6667f28
7f20: 801482f4 80122ec8 b6667f44 b6667f38 80122ec8 807851e0 3da2c000 80b8cd40
7f40: 80149680 00000000 00000002 b6666000 b6666000 00000000 b6667fb0 00000000
7f60: b6666000 15384ee0 00000002 b6666000 00000000 b6667fb0 00031498 200d0010
7f80: ffffffff 1501d56c 0000000c 010156d8 00000122 80107f64 00000000 b6667fa8
7fa0: 80107dc0 8065b4b8 1501d56c 0000000c 00000006 154de000 000005d4 00000000
7fc0: 1501d56c 0000000c 010156d8 00000122 1501c000 000000ab 15384ee0 01c36a68
7fe0: 00000000 1514c5a8 00011418 0008bfe8 600d0010 00000006 00000000 00000000
Code: bad PC value

@bboreham
Contributor

@press5 we closed this issue based on evidence that the kernel bug is fixed in later kernel versions. Since you report the same old kernel version, I don't see any reason to look further.

A workaround is described at #3314 (comment)

@aallbrig

aallbrig commented Dec 29, 2019

[Screenshot attached: 2019-12-29 14 02 04]

Posting to confirm that running rpi-update and then restarting my Raspberry Pi allows the weave net pods to run without a hitch. Thanks to everyone's collaboration 👍

Just to add my experience... I'm following a Medium guide to set up a 5-node k8s Pi cluster, and discovered that the weave net pods are problematic if you strictly follow that article's commands (it happens 😏).

Here are the pi models in my cluster:

* Raspberry Pi 4 Model B Rev 1.1 2GB
* Raspberry Pi 4 Model B Rev 1.1 1GB
* Raspberry Pi 3 Model B Plus Rev 1.3 1GB
* Raspberry Pi 3 Model B Rev 1.2 1GB
* Raspberry Pi Model B Plus Rev 1.2 512MB

(command to acquire this info: cat /proc/device-tree/model && echo && free -m)

I decided to make the Raspberry Pi 3 Model B Plus Rev 1.3 my k8s master node and saw that weave net entered a k8s crash loop. Running rpi-update and then sudo reboot afterwards allowed the Pi 3 B+ to run weave net (yeet).
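
(If anyone else wants to verify the same fix, a couple of quick checks; the pod label is the one the standard weave-net manifest applies:)

# kernel should be newer than the broken 4.14.34 build after rpi-update + reboot
$ uname -r
# all weave-net pods should settle into Running with no further restarts
$ kubectl -n kube-system get pods -l name=weave-net -o wide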

Cheers all!
