cli: enable jumbo frames for GCP VPCs #1146
Conversation
From a benchmarking perspective, this looks good and makes sense.
A theoretical side effect is packet loss if the environment does not actually offer this MTU. This is a very controlled environment, so I think the risk of someone changing the MTU on our VPCs is negligible.
I did not try this end to end but the change looks good to me.
For more information, see https://cloud.google.com/vpc/docs/mtu.

For TCP internet traffic, Google Cloud performs MSS clamping, so TCP should behave fine. For UDP, the Linux stack usually does Path MTU Discovery to avoid sending oversized packets... but you never know, and I am also not super experienced with UDP deployments. What I could see is that some badly programmed service running in our cluster misbehaves because it takes the MTU from a local network adapter and just spews out large UDP packets - without responding to PMTU and adjusting their size. That could break things.

Unfortunately, I have no idea what good real-world UDP applications are to test with. All UDP tests I did that could send larger UDP packets (iperf & nc) behaved relatively the same with the default MTU and the high MTU. The same means in both good & bad ways - it's still UDP ;) All default cluster services seem to run fine, and the sonobuoy tests pass. If there's an issue, it seems to be subtle and likely limited to UDP, but I don't see anything.

So for now I would say: let's merge it and test it a bit more with normal usage. The next release is still ~2 weeks away, so there's hopefully plenty of time to test this and catch unexpected breakage. Even if breakage shows up later, you can turn down the MTU size without having to destroy the cluster - you just need to shut down all the machines, modify the VPC, and turn the cluster back on.
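For illustration, a minimal sketch of what rolling the MTU back could look like, assuming the VPC is managed through Terraform's `google_compute_network` resource (the resource and network names are illustrative, not taken from this PR, and depending on the provider version an `mtu` change may require recreating the network):

```hcl
# Illustrative only: lower the VPC MTU back to GCP's default of 1460.
# All VMs attached to the network should be shut down before applying,
# and started again afterwards so they pick up the new MTU.
resource "google_compute_network" "vpc_network" {
  name                    = "constellation-vpc" # hypothetical name
  auto_create_subnetworks = false
  mtu                     = 1460 # GCP default; valid range is 1300-8896
}
```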
Proposed change(s)
This enhances general network performance inside Kubernetes with Cilium by ~4x (from ~1 Gbit/s to ~4 Gbit/s).
Note that Terraform's documentation on the `mtu` field incorrectly states that the maximum MTU setting is 1500. I'll open a PR with them later to fix this.
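For reference, a minimal sketch of the kind of Terraform configuration this refers to, assuming a `google_compute_network` resource (names are illustrative, not copied from this PR):

```hcl
# Illustrative only: enable jumbo frames on the VPC.
# GCP accepts MTU values from 1300 up to 8896, despite the 1500 maximum
# currently stated in the provider documentation for the mtu field.
resource "google_compute_network" "vpc_network" {
  name                    = "constellation-vpc" # hypothetical name
  auto_create_subnetworks = false
  mtu                     = 8896 # maximum MTU supported by GCP VPCs
}
```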