Process packets only on the POSTROUTING hook (before SNAT) #54
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aojea. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Change-Id: I774e00a799b9f790924468ece37ced419618c6a4
hmm, ipv6 flakes now 🤔
2 / 2 green with this PR: https://github.com/kubernetes-sigs/kube-network-policies/actions. Always accepting ICMPv6 ND seems to make it work reliably:
// IPv6 needs ICMP Neighbor Discovery to work
tx.Add(&knftables.Rule{
	Chain: chainName,
	Rule: knftables.Concat(
		"icmpv6", "type", "{ nd-neighbor-solicit, nd-neighbor-advert }", "accept",
	),
})
nit: it's more readable to write all-constant strings as one string (#49 (comment)):
tx.Add(&knftables.Rule{
	Chain: chainName,
	Rule:  "icmpv6 type { nd-neighbor-solicit, nd-neighbor-advert } accept",
})
(same for other similar cases)
Besides the nit, this lgtm. For documentation, this is how it looks now:
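A rough sketch of the resulting ruleset (the table name, chain name, and exact priority offset are assumptions, not the captured output):

```
table inet kube-network-policies {
	chain postrouting {
		# filter-type base chain registered just before SNAT (srcnat - 5 = 95)
		type filter hook postrouting priority srcnat - 5; policy accept;
		# ICMPv6 Neighbor Discovery is always accepted so IPv6 keeps working
		icmpv6 type { nd-neighbor-solicit, nd-neighbor-advert } accept
		...
	}
}
```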
I tested e2e in dual-stack with IPv6 as "base family" on K8s v1.30.3.
An unrelated comment: when building a …
Thanks Lars, do you mind following up on those nits? I'd like to get a release with IPVS working and the nftables initialization fix.
This appears to break my setup with kube-proxy in nftables mode and flannel with … I haven't had time to dig into the issue yet, but figured I'd leave a comment in case this wasn't tested with kube-proxy in nftables mode.
My mistake, sorry. I ran on k8s v1.30.3 without the nftables feature-gate. I ran again on v1.31.0-beta.0 and the e2e tests pass with proxy-mode=nftables.
Once we have support in kind, we can have CI in Kubernetes master: kubernetes-sigs/kind#3612
Hold that, please see the updated #54 (comment)
This is how it looks with proxy-mode=nftables (narrowed to postrouting chains):
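A sketch of how the two postrouting registrations compare (kube-proxy's nftables-mode table and chain names are assumed here, not taken from the captured listing):

```
# lower priority number = runs earlier in the hook
table inet kube-network-policies {
	chain postrouting {
		type filter hook postrouting priority srcnat - 5; policy accept;  # 95: before SNAT
	}
}
table ip kube-proxy {
	chain masquerading {
		type nat hook postrouting priority srcnat; policy accept;         # 100: standard SNAT
	}
}
```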
The netpol chain already has a higher prio. @vaskozl What problem do you see? The netpol e2e is only cluster-internal (I think). Do you have problems with external traffic?
Sorry, I wasn't sure there was an issue and didn't have details, hence I didn't want to create a potentially false issue if nftables tested successfully. In addition to the two tables you listed, I have the following: …
I tried upgrading twice, and within seconds I see failed liveness probes across all pods and failed image pulls. External connections from within the cluster appear to time out too. Downgrading required full node restarts. It is a very weird issue indeed, given that for some reason I required a restart (I did check that the forward rules were reapplied on revert). I can look to debug more over the weekend and will create a proper issue with details if I find anything wrong.
No worries about that, feel free to open issues like that to discuss. I'm not worried about issues that turn out to be environment problems; for people like you who spend the time to discuss and troubleshoot, this feedback is very welcome. I'm going to set up a job to test nftables so we can rely on automated testing to validate nftables functionality.
This is interesting. It might be that the netpol postrouting chain interferes in some way with external traffic. It makes sense to me, since masquerading should affect pod-IP -> non-pod-IP traffic only. I tested with the versions below, and netpol e2e passes.
BTW "BaseFamily=IPv6" meaning that the default family for services is IPv6 in a dual-stack cluster:
|
🤔 But then external traffic problems should be a v0.5.0 issue, rather than an nftables issue. I'll run some tests with outgoing traffic from pods. I have no image load problems though.
@vaskozl What kind of netpol rules do you use? The most logical explanation to me is that egress-blocking rules would be causing issues for external traffic, but I may be wrong. In my own tests I use ingress-blocking rules, so I want to extend them with relevant rules.
I added TCP liveness/readiness probes, and if I apply my "deny ingress" rule, the probes fail. This makes sense to me; a deny-ingress should block the probes. @aojea Do you know if there is some "don't do that" for this situation documented somewhere? Please don't search for it, I can do that myself, but if you happen to know...
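For reference, a "deny ingress" policy of the kind described is typically written like this (names assumed for illustration; it selects every pod in the namespace and allows no ingress peers):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress   # name assumed for illustration
  namespace: default
spec:
  podSelector: {}          # matches all pods in the namespace
  policyTypes:
    - Ingress              # no ingress rules listed, so all ingress is denied
```

Since kubelet's probe connections originate on the node and traverse the node's POSTROUTING hook on their way to the pod IP, v0.5.0 evaluates them against this policy, which appears to be why the probes fail.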
I don't have any egress rules, but I apply traditional (not banp/anp) ingress netpols to most everything in the cluster. AFAIK host-to-local-workload traffic should not be blocked by network policies, and it isn't by the CNIs that support netpols. I use 1.30 with flannel 0.25.3 and IPv4 single-stack. I'll make an issue after I debug, ASAP.
I backed down to v0.4.0, and TCP probes work with a "deny-all-ingress" rule applied. @aojea I think this is the root of the problem.
You are right. Neither v0.4.0 nor Calico or Cilium netpol does that.
I can't explain the failing image loads though.
That may have been a knock-on effect of the cluster-wide liveness probe failures causing restarts, plus some rolling restarts I issued that slowed down my kubelet, and as such an indirect observation on my relatively weak setup. Otherwise, maybe hostNetwork pods that I also have rules on? I'll try to have more accurate information about the failure soon.
@danwinship is the source of truth for network policies 😄 and has also revisited the problem of probes recently; it is also mentioned in the KEP: kubernetes/enhancements#4558
I think that traffic originating from the node should not be subject to network policies. I checked with nftables trace, and all traffic coming from the node has the …
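For anyone reproducing that kind of check: nftables tracing is enabled by setting `meta nftrace set 1` in a rule and then watching the events (the chain used here is an assumption; any early rule works):

```sh
# Flag packets traversing the chain for tracing.
nft insert rule inet kube-network-policies postrouting meta nftrace set 1

# Print a trace of every chain and rule the flagged packets then hit.
nft monitor trace
```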
For the record: this is not a proxy-mode=nftables problem.
Yeah, I opened a new issue so we have this well documented: #65. Please continue the conversation there. I also submitted a PR and tagged you both for feedback.
We only need to process packets that are destined to Pods, never packets destined to the local host.
Based on the netfilter diagram https://wiki.nftables.org/wiki-nftables/index.php/Netfilter_hooks this can be on FORWARD and POSTROUTING.
However, FORWARD does not work with IPVS, as it seems the IPVS kernel logic follows a different path: http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.filter_rules.html
Using OUTPUT would force us to discriminate the traffic that is destined to localhost, so it is not impacted by network policies, usually by adding a rule to accept packets to `lo`. Using POSTROUTING before SNAT happens (to avoid losing the original source IP) seems to be the right place.
Fixes: #46