Antrea wildcard fqdn netpolicy not working #3680
Comments
@Dyanngg could you help triage this issue? IIRC, wildcard rules rely on DNS response interception (instead of proactive querying), so there could be an issue with that code?
@jsalatiel Actually, could you please share in the issue the rules you use to allow Pods to resolve DNS? Those might very well be relevant.
Update: I have tried this on my own test setup and wasn't able to reproduce. @jsalatiel, one thing I've noticed however is the appliedTo you used in the fqdn policy. Did you put the same appliedTo in the baseline deny-all policy? The reason I'm asking is that we need to make sure the two-way communication between the client Pod and the DNS Pod is not dropped by the baseline deny rule.
Hi @Dyanngg. These are my other policies:
Those should allow both ingress/egress from/to kube-system, and egress to kube-dns and nodelocaldns from anywhere.
169.254.25.10/32 is the nodelocaldns deployed by kubespray. (Could that be related to the problem?)
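(The actual manifests did not survive in this thread. As a rough sketch only, an allow-DNS egress policy along those lines might look like the following; the name, tier, and selectors are assumptions, not the reporter's exact rules:)

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  priority: 1
  tier: securityops
  appliedTo:
    - podSelector: {}
  egress:
    - action: Allow
      to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
        - ipBlock:
            cidr: 169.254.25.10/32   # nodelocaldns deployed by kubespray
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```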
More debugging here. I have changed the default DROP to a default REJECT in the baseline tier, to make sure I would get a single line in np.log for the curl. For the following netpol (changed from ACNP to ANP):
I get the output:
and the respective netpol.log:
When I change the netpol to:
I get the following output on curl:
and the following netpol.log:
The only netpolicies I have are:
Hi @jsalatiel,
@jsalatiel Since you are using nodelocaldns, I also have a few questions:
@Dyanngg This is the result of dig against the nodelocaldns IP, run from a container.
So apparently it is resolving, and it is on UDP 53. Why would Antrea intercept www.google.com correctly but not *.google.com, if those are all DNS queries?
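(The dig output itself was not captured. The check described would be something like the following; the domain queried is illustrative:)

```sh
# Query nodelocaldns directly from inside a Pod; 169.254.25.10 is the
# nodelocaldns IP mentioned above.
dig @169.254.25.10 www.google.com +short
```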
For policies with a specific FQDN (as opposed to wildcard FQDNs), Antrea will directly contact the DNS server specified with an env variable. You can check this by looking at the Antrea agent logs:
which should produce a log line like:
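(The exact variable name and log line did not survive in this thread. A hypothetical way to look for DNS/FQDN activity in the agent logs, assuming the standard antrea-agent Pod labels; the grep pattern is an assumption, not documented Antrea output:)

```sh
# Grep the antrea-agent logs for FQDN-related lines across all agent Pods.
kubectl -n kube-system logs -l component=antrea-agent -c antrea-agent | grep -i fqdn
```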
@antoninbas I have added skipServices: ["kube-system/kube-dns"] and restarted all Pods, but I still get the same problem. So I think it is not related.
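(For reference, skipServices is an antrea-agent configuration option; in recent Antrea versions it sits under the antreaProxy section of antrea-agent.conf, though the exact placement may vary by version:)

```yaml
# Excerpt of antrea-agent.conf (in the antrea-config ConfigMap),
# showing the setting referenced above.
antreaProxy:
  skipServices:
    - kube-system/kube-dns
```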
@jsalatiel your answers to 2 & 4 are surprising to me. I would have expected the ClusterIP for CoreDNS there, even when using nodelocaldns. The dnsPolicy for the antrea-agent Pod is …. BTW, do you see anything interesting in the antrea-agent logs about DNS queries?
Ignore this, I see that kubespray configures kubelet this way, so it is expected:
@Dyanngg If you cannot reproduce after deploying NodeLocal DNSCache to your cluster, you may need to provision a cluster with Kubespray.
@antoninbas The agent logs show:
This is not really an issue. Apparently kubespray uses a different name for the CoreDNS service.
But it does explain why the policy works when …
@jsalatiel that would be what I mean by a different name. Most clusters I have encountered (including the ones provisioned by kubeadm) use kube-dns as the name of the CoreDNS Service.
I'm trying to repro the issue with a kubespray provisioned cluster. @jsalatiel Could you confirm that you installed Antrea on such kubespray cluster after uninstalling the original CNI (I see flannel as the default)? Just trying to get the same setup.
@Dyanngg Use this override.yml and pass -e '@override.yml' on the ansible command line. You will get exactly my setup (no default CNI).
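(The override.yml itself was not captured in this thread. A guess at the relevant kubespray variables, named per the kubespray docs, with values assumed from the rest of the discussion:)

```yaml
# Hypothetical reconstruction, not the reporter's actual file.
kube_network_plugin: cni      # deploy no default CNI
enable_nodelocaldns: true
nodelocaldns_ip: 169.254.25.10
```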
BTW, the underlying OS in my kubespray cluster is AlmaLinux 8.5.
I was able to reproduce this issue on a kubespray cluster with nodelocaldns enabled, with an Antrea v1.6 build. Wildcard FQDN rule matching is made possible by Antrea installing a DNS reply packet interception rule at the highest priority in the ingress tables. Unfortunately, with nodelocaldns, the DNS query response packet matches a bypass flow and thus skips the DNS intercept flow. With PR #3510, the above-mentioned flow is changed to match only non-reply packets, and thus will no longer bypass DNS reply packets for the ingress tables.
@hongliangl Since we have this bug that can be resolved with #3510, maybe we could backport it to v1.6? (Special thanks to @antoninbas for help in debugging this issue.)
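(As a side note, one way to inspect the DNS flows involved on a node is to dump the OVS flows from the agent Pod; the Pod name below is a placeholder, and the table layout varies across Antrea versions:)

```sh
# Dump the flows matching DNS ports on Antrea's OVS bridge.
kubectl -n kube-system exec antrea-agent-xxxxx -c antrea-ovs -- \
  ovs-ofctl dump-flows br-int | grep -E 'tp_src=53|tp_dst=53'
```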
@hongliangl I approved both PRs. BTW, I think it is possible to cherry-pick 2 separate changes with a single PR.
Thanks @antoninbas
Are there plans to release 1.6.1? Or will the fix only be available in 1.7?
@jsalatiel The fix will be included in 1.6.1. The release will be late this week or next.
@jsalatiel https://github.com/antrea-io/antrea/releases/tag/v1.6.1 has been released, which should have the fix. Please let us know if the issue is resolved. Other minor releases should not have this issue.
Working perfectly. |
Describe the bug
According to the netpol documentation, one could use an example like the following to match an FQDN:
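(The original snippet was lost in scraping. A sketch along the lines of the Antrea-native policy documentation; the name, namespace, selectors, and port are illustrative:)

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: NetworkPolicy
metadata:
  name: allow-google-fqdn
  namespace: default
spec:
  priority: 1
  appliedTo:
    - podSelector: {}
  egress:
    - action: Allow
      to:
        - fqdn: "*.google.com"
      ports:
        - protocol: TCP
          port: 443
```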
I also have a cluster netpol default-deny with priority 999 in the Baseline tier, so by default all traffic should be denied except traffic to Google. The problem is that if I try to curl www.google.com from the container, it is still denied by the default-deny baseline rule. If I change the fqdn policy to allow "www.google.com" instead of "*.google.com", it does work, so for some reason the wildcard FQDN is not working.
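(Again a sketch only: the baseline deny-all described above would look roughly like this; the name and selectors are assumptions:)

```yaml
apiVersion: crd.antrea.io/v1alpha1
kind: ClusterNetworkPolicy
metadata:
  name: default-deny
spec:
  priority: 999
  tier: baseline
  appliedTo:
    - podSelector: {}
  ingress:
    - action: Drop
  egress:
    - action: Drop
```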
To Reproduce
Expected
It should work.
Actual behavior
The wildcard FQDN is not matching; only an exact FQDN works.
Versions:
kubectl version: 1.22.8
Additional info: There are some other rules that allow the Pods to resolve DNS, for example, but I removed those from the description because they are not related to the problem.