Master issue for tracking flannel vxlan failures #2013

Closed
brandond opened this issue Jul 11, 2020 · 0 comments

Environmental Info:
K3s Version:
k3s 1.17 and up

Node(s) CPU architecture, OS, and Version:
Linux kernels < 5.7 (I do not believe the patch has been backported by any distros yet).
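
As a rough way to flag hosts that may be affected, the kernel release string can be compared against 5.7. This is only a sketch: the helper names are made up for illustration, and a version comparison cannot tell whether a distro has backported the fix into an older kernel.

```python
import platform
import re
from typing import Tuple

# First mainline kernel series stated in this issue to be unaffected.
FIXED_IN = (5, 7)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def may_be_affected(release: str) -> bool:
    """True if the release is older than 5.7 (backports are NOT detected)."""
    return parse_kernel_release(release) < FIXED_IN


def host_may_be_affected() -> bool:
    """Check the kernel of the machine this script runs on."""
    return may_be_affected(platform.release())
```

Running `host_may_be_affected()` on each node gives a first-pass answer; a node reported as possibly affected may still carry a vendor backport.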

Cluster Configuration:
Any cluster with pods on more than one node.

Describe the bug:
Intermittent errors (timeouts, etc.) in communication between pods and to services. Timeouts are most likely when the two pods are on different nodes, or when a service is accessed through an entry point on one node and the backing pod runs on a different node.

Steps To Reproduce:

  • Install k3s server with default flannel vxlan backend
  • Install k3s agent
  • Create pod on server node
  • Create pod on agent node
  • Attempt to communicate between pods
  • Note timeouts or errors
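
The "attempt to communicate" step above can be sketched as a small TCP probe run from inside one pod against the other. The function name, target address, and port are hypothetical; the only assumption is that the target pod listens on some TCP port.

```python
import socket
import time
from typing import Tuple


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, float]:
    """Try one TCP connection; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start


def probe_many(host: str, port: int, attempts: int = 20) -> int:
    """Return how many of `attempts` connection attempts failed."""
    failures = 0
    for _ in range(attempts):
        ok, _elapsed = probe_tcp(host, port)
        if not ok:
            failures += 1
    return failures
```

For example, `probe_many("10.42.1.7", 8080)` would be called with the IP of the pod on the other node (both values here are placeholders). Failures that appear only across nodes, and not between pods on the same node, match the symptom described in this issue.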

Expected behavior:
No errors or timeouts

Actual behavior:

A bug in the kernel netfilter code causes UDP checksums to be miscalculated under certain circumstances. Kubernetes' iptables rules, combined with vxlan's use of UDP encapsulation, trigger this bug, so vxlan packets between nodes are dropped on the receiver due to invalid checksums.
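
To make "dropped due to invalid checksums" concrete, here is a minimal sketch of the standard IPv4 UDP checksum (ones' complement sum over a pseudo-header plus the UDP segment) and the check a receiver performs. It does not reproduce the kernel bug itself; the addresses, ports, and function names are illustrative assumptions.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header: src, dst, zero byte, protocol 17 (UDP), UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_udp(src_ip: str, dst_ip: str, sport: int, dport: int,
              payload: bytes) -> bytes:
    """Build a UDP segment with a correct checksum field."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, length) + header + payload)
    csum = (~total & 0xFFFF) or 0xFFFF  # 0 is reserved for "no checksum"
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def udp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver-side check: the sum over everything must be 0xFFFF."""
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF
```

Because the source and destination addresses are part of the sum, any component that rewrites an address or payload byte without also correcting the checksum field produces a segment for which `udp_checksum_ok` returns `False`, and a receiver that validates checksums discards it.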

Additional context / logs:
Kernel patch: torvalds/linux@ea64d8d
Kubernetes patch to avoid triggering kernel bug: kubernetes/kubernetes#92035
Flannel workaround (declined in favor of kubernetes patch): flannel-io/flannel#1282
