Commit new connections after NetworkPolicy check #220

Merged
merged 1 commit into antrea-io:master from issue219 on Dec 17, 2019

Conversation

wenyingd
Contributor

Add a conntrackCommit table (#105) between the ingressDefault table (#100) and
the L2ForwardingOut table (#110); packets of new connections are committed in
this table. There is no table-miss flow in this table.

Fixes #219
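
For illustration only, the flows in such a table might look roughly like this (a minimal sketch in standard ovs-ofctl syntax; the bridge name br-int, the exact matches and the priorities are assumptions, not the flows this PR actually installs):

# Hypothetical flows for the conntrackCommit table (105): commit packets of new,
# tracked connections, then continue to the L2ForwardingOut table (110).
ovs-ofctl add-flow br-int "table=105,priority=200,ct_state=+new+trk,ip,actions=ct(commit),resubmit(,110)"
# Other IP packets (e.g. of already-committed connections) just move on to table 110.
ovs-ofctl add-flow br-int "table=105,priority=190,ip,actions=resubmit(,110)"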

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests. This command can only be run by members of the vmware-tanzu organization
  • /skip-e2e: to skip e2e tests. This command can only be run by members of the vmware-tanzu organization

@wenyingd
Contributor Author

/test-e2e

@wenyingd changed the title from "Commit new connections after NetworkPolicy check" to "[WIP]Commit new connections after NetworkPolicy check" on Dec 13, 2019
@wenyingd force-pushed the issue219 branch 2 times, most recently from 1d11714 to 4c13829, on December 13, 2019 10:18
@wenyingd
Contributor Author

/test-e2e

@antoninbas
Contributor

@wenyingd It seems that this didn't fix the test issue. Is there anything else at play here?

@wenyingd
Contributor Author

wenyingd commented Dec 16, 2019

Hi @antoninbas, it doesn't work as we expected when we change to committing the connection at the end of the pipeline. With this change, we also have to commit the traffic from Pod to Service, otherwise the reply packets could not be sent back to the Pod without conntrack support. We can ensure the commit happens after the NetworkPolicy check for Pod-to-Pod traffic, but that doesn't work for Service traffic, because we cannot predict whether the final action will be drop or allow for Service traffic before iptables NAT is applied.
So we have to commit all Service traffic.
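
As a purely hypothetical illustration of what "commit all Service traffic" means (ovs-ofctl syntax; the bridge name, the Service CIDR 10.96.0.0/12 and the table numbers are placeholders, not the actual Antrea flows):

# Hypothetical flow: in the DNAT table (40 here as a placeholder), packets destined
# for the Service CIDR are resubmitted to the conntrackCommit table (105), so the
# connection is committed regardless of any NetworkPolicy verdict, since the verdict
# cannot be known before kube-proxy's iptables DNAT has happened.
ovs-ofctl add-flow br-int "table=40,priority=200,ip,nw_dst=10.96.0.0/12,actions=resubmit(,105)"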

@antoninbas
Contributor

@wenyingd I understand your comment. I'm fine with committing both connections (pre-DNAT and post-DNAT). I am not sure whether there is an alternative solution. Maybe we could accept all traffic if the source IP is in the service CIDR? I'm not sure it's better though...

I tested your changes locally and they work. The Kind test is failing for your Pod->Pod traffic test because runWgetCommandFromTestPod tries to do a DNS lookup on the Pod name which is not supported (only works for service name). If you hardcode the Pod IP it should work.

BTW, I thought of another potential fix to the original issue: we could use +rpl instead of +est in the conntrack rules for network policies. However, it just hides the problem instead of fixing it. What we have now (commit after applying NP rules) is better IMO.
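
To make the comparison concrete, the difference would only be in the ct_state match of the flows that let tracked traffic bypass the NetworkPolicy tables (illustrative ovs-ofctl syntax with placeholder table numbers, not the exact Antrea flows):

# +est: any packet of an established (already allowed and committed) connection
# skips the NetworkPolicy tables.
ovs-ofctl add-flow br-int "table=90,priority=210,ct_state=+est+trk,ip,actions=resubmit(,105)"
# +rpl: only reply-direction packets skip the NetworkPolicy tables; later packets
# from the original sender would still be evaluated against the rules.
ovs-ofctl add-flow br-int "table=90,priority=210,ct_state=+rpl+trk,ip,actions=resubmit(,105)"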

@wenyingd
Contributor Author

@antoninbas Thanks for verifying the change. I don't think +rpl could work in the NP rules, because it only matches the reply packets (ACK) and does not include later packets sent from the original sender. According to OVS, once a connection is established, +est should match all later packets. So I prefer using "+est" to mark the accepted connections.

I changed the UT to use the Pod directly because it originally failed when the test case used the Service address. We suspect the reason is that we commit all connections destined for the Service to conntrack, no matter whether they would be dropped by a NP rule or not. And I think our design of committing the connection after the NP rule check might not work for the Service case, at least not when the Service relies on iptables rather than OpenFlow.

I don't understand "Maybe we could accept all traffic if the source IP is in the service CIDR?", since I don't think there is any traffic that uses the Service CIDR as its source IP.

@antoninbas
Contributor

See replies inline

> @antoninbas Thanks for verifying the change. I don't think +rpl could work in the NP rules, because it only matches the reply packets (ACK) and does not include later packets sent from the original sender. According to OVS, once a connection is established, +est should match all later packets. So I prefer using "+est" to mark the accepted connections.

We only need it to work for reply packets. Subsequent packets from the sender can still go through the regular flows (with no performance penalty IMO). Actually that may be a good thing, because right now if a network policy is updated while a long-lived connection has already been established, the new network policy will never be applied to the existing connection. We discussed this with @salv-orlando and we thought it could be better to have a way to enforce new network policies on existing connections.

> I changed the UT to use the Pod directly because it originally failed when the test case used the Service address. We suspect the reason is that we commit all connections destined for the Service to conntrack, no matter whether they would be dropped by a NP rule or not. And I think our design of committing the connection after the NP rule check might not work for the Service case, at least not when the Service relies on iptables rather than OpenFlow.

Confirmed that the test case passes when using Pods directly with the following change:

diff --git a/test/e2e/framework.go b/test/e2e/framework.go
index c22f9fb..669543b 100755
--- a/test/e2e/framework.go
+++ b/test/e2e/framework.go
@@ -780,7 +780,7 @@ func (data *TestData) runPingCommandFromTestPod(podName string, targetIP string,
 }
 
 func (data *TestData) runWgetCommandFromTestPod(podName string, svcName string) error {
-       cmd := []string{"nc", "-vz", "-w", "8", svcName + "." + testNamespace, "80"}
+       cmd := []string{"nc", "-vz", "-w", "8", svcName, "80"}
        _, _, err := data.runCommandFromPod(testNamespace, podName, busyboxContainerName, cmd)
        return err
 }
diff --git a/test/e2e/networkpolicy_test.go b/test/e2e/networkpolicy_test.go
index aa6bfc9..9852aad 100755
--- a/test/e2e/networkpolicy_test.go
+++ b/test/e2e/networkpolicy_test.go
@@ -102,12 +102,13 @@ func TestIPBlockWithExcept(t *testing.T) {
                }
        }()
 
+       nginxPodIP, _ := data.podWaitForIP(defaultTimeout, nginxPodName)
        // pod0 can wget to service.
-       if err = data.runWgetCommandFromTestPod(podName0, nginxPodName); err != nil {
+       if err = data.runWgetCommandFromTestPod(podName0, nginxPodIP); err != nil {
                t.Fatalf("Pod %s should be able to connect Pod %s, but was not able to connect", podName0, nginxPodName)
        }
        // pod1 cannot wget to service.
-       if err = data.runWgetCommandFromTestPod(podName1, nginxPodName); err == nil {
+       if err = data.runWgetCommandFromTestPod(podName1, nginxPodIP); err == nil {
                t.Fatalf("Pod %s should not be able to connect Pod %s, but was able to connect", podName1, nginxPodName)
        }
 }

> I don't understand "Maybe we could accept all traffic if the source IP is in the service CIDR?", since I don't think there is any traffic that uses the Service CIDR as its source IP.

When reply traffic goes through kube-proxy, the source IP gets rewritten to the Service IP, so that the 5-tuple is correct when the sender receives the reply traffic. For example, when you capture traffic on the gw interface you may see this:

04:01:21.282628 IP 10.10.1.14.80 > 10.10.1.15.57838: Flags [P.], seq 1:233, ack 77, win 502, options [nop,nop,TS val 704332598 ecr 3004701775], length 232: HTTP: HTTP/1.1 200 OK
04:01:21.282639 IP 10.108.139.23.80 > 10.10.1.15.57838: Flags [P.], seq 1:233, ack 77, win 502, options [nop,nop,TS val 704332598 ecr 3004701775], length 232: HTTP: HTTP/1.1 200 OK
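
For reference, a capture like the one above can be taken on the node with plain tcpdump (the gateway interface name gw0 is an assumption and depends on the Antrea configuration):

# Hypothetical capture command on the node hosting the Pods.
tcpdump -i gw0 -n tcp port 80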

@antoninbas
Contributor

At this stage it seems that this patch (commit connection to Pods after network policy enforcement) is a good workaround for the issue with the netdev implementation of conntrack, so unless I am misunderstanding something I believe we can merge it. The next step would be to reproduce the issue with a simple 2-table pipeline using only the OVS command-line tools, so we can report the issue to the OVS project.
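
A rough sketch of what such a standalone reproduction could look like (assumptions: a throwaway bridge named br-test using the userspace/netdev datapath, ports attached separately, and only standard ovs-vsctl/ovs-ofctl commands; an illustration, not a verified reproducer):

# Create a test bridge using the netdev (userspace) datapath.
ovs-vsctl add-br br-test -- set bridge br-test datapath_type=netdev
ovs-ofctl del-flows br-test
# Table 0: send IP traffic through conntrack and recirculate to table 1.
ovs-ofctl add-flow br-test "table=0,priority=100,ip,actions=ct(table=1)"
ovs-ofctl add-flow br-test "table=0,priority=0,actions=NORMAL"
# Table 1: commit new connections, allow established ones.
ovs-ofctl add-flow br-test "table=1,priority=100,ct_state=+new+trk,ip,actions=ct(commit),NORMAL"
ovs-ofctl add-flow br-test "table=1,priority=100,ct_state=+est+trk,ip,actions=NORMAL"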

Add conntrackCommitTable(antrea-io#105) between ingressDefaultTable(#100) and
L2ForwardingOutTable(antrea-io#110); packets of new connections are committed
in this table.
Packets can be resubmitted to the conntrackCommitTable from the DNATTable and
the ingressDefaultTable.
@wenyingd
Contributor Author

/test-e2e

@wenyingd changed the title from "[WIP]Commit new connections after NetworkPolicy check" to "Commit new connections after NetworkPolicy check" on Dec 17, 2019

@antoninbas left a comment

LGTM

@antoninbas merged commit 2a14d18 into antrea-io:master on Dec 17, 2019
@wenyingd deleted the issue219 branch on April 24, 2020 02:17

Successfully merging this pull request may close these issues.

NetworkPolicy doesn't work in Kind cluster because of a netdev datapath issue