Add documentation for Antrea's OVS pipeline #206
Thanks for your PR. The following commands are available:
docs/ovs-pipeline.md
Outdated

All traffic is finally resubmitted to the [DnatTable].

### DnatTable (40)
Probably: "DNATTable"
nice!
docs/ovs-pipeline.md
Outdated
* *table miss-miss flow entry*: a "catch-all" entry in a OpenFlow table, which
is used if no other flow is a match. If the table-miss flow entry does not
exist, by default packets unmatched by flow entries are dropped (discarded).
* *conjuctive match fields*: an efficient way in OVS to implement conjunctive
conjuctive -> conjunctive
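For readers following the table-miss definition quoted above, the lookup behavior can be sketched in Python. This is only an illustration of the semantics described in the doc text (the highest-priority matching flow wins; with no table-miss entry, unmatched packets are dropped). The `Flow` type, the `lookup` helper and the sample flows are hypothetical and are not part of Antrea or OVS.

```python
from typing import Callable, List, NamedTuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow entry: a priority, a predicate, an action."""
    priority: int
    matches: Callable[[dict], bool]
    action: str


def lookup(table: List[Flow], packet: dict) -> str:
    """Return the action of the highest-priority flow that matches the packet."""
    for flow in sorted(table, key=lambda f: f.priority, reverse=True):
        if flow.matches(packet):
            return flow.action
    # No flow matched and there is no table-miss entry: default is to drop.
    return "drop"


# Sample table (assumed values): one specific flow plus a catch-all table-miss
# entry, modeled here as priority 0 with a predicate that matches everything.
specific = Flow(200, lambda p: p.get("nw_src") == "10.10.1.2", "drop")
table_miss = Flow(0, lambda p: True, "resubmit(,70)")

with_miss = [specific, table_miss]
without_miss = [specific]
```

With the table-miss entry present, a packet from another source is resubmitted; without it, the same packet falls through to the default drop.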
docs/ovs-pipeline.md
Outdated

After this table, ARP traffic is resubmitted to [ARPResponderTable], while IP
traffic is resubmitted to [ConnectionTrackingTable]. Traffic which does not
ConnectionTrackingTable -> ConntrackTable
docs/ovs-pipeline.md
Outdated
action, then the packets in the flow go through the switch in the same way
that they would if OpenFlow was not configured on the switch. Antrea uses this
action to process ARP traffic as a regular learning L2 switch would.
* *table miss-miss flow entry*: a "catch-all" entry in a OpenFlow table, which
Do you mean "table-miss flow entry"?
+1
docs/ovs-pipeline.md
Outdated
This table complements [EgressRuleTable] for Network Policy egress rule
implementation. In K8s, when a Network Policy is applied to a set of Pods, the
default behavior for these Pods become "deny" (it becomes an [isolated
Pod](https://kubernetes.io/docs/concepts/services-networking/network-policies/#isolated-and-non-isolated-pods). This
Suggested change:
Pod](https://kubernetes.io/docs/concepts/services-networking/network-policies/#isolated-and-non-isolated-pods). This
->
Pod](https://kubernetes.io/docs/concepts/services-networking/network-policies/#isolated-and-non-isolated-pods)). This
Otherwise it appears as ... "deny" (it becomes an isolated Pod. This ...
docs/ovs-pipeline.md
Outdated
can go through.

The rest of the flows read as follows: if the source IP address is in set
{10.10.1.2, 10.10.1.3}, and the destination port is in the set {3, 4} (which
maybe "the destination OF port" to avoid confusion.
+1
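The reading of the conjunctive flows quoted above can be sketched in Python. It only illustrates the set semantics from the doc text (source IP in one set AND destination OF port in another). The function name `conjunction_allows` is hypothetical, and this is not how OVS evaluates the `conjunction` action internally.

```python
# Sets taken from the quoted doc text; the names are illustrative only.
SRC_IPS = {"10.10.1.2", "10.10.1.3"}
DST_OF_PORTS = {3, 4}


def conjunction_allows(src_ip: str, dst_of_port: int) -> bool:
    """Return True only when every dimension of the conjunction matches."""
    return src_ip in SRC_IPS and dst_of_port in DST_OF_PORTS
```

Modeling each dimension as a set reflects why the approach scales: two sets of sizes N and M are enough, rather than N x M individual flows.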
docs/ovs-pipeline.md
Outdated

In the future this table may support an additional mode of operations, in which
it will implement kube-proxy functionality and take care of performing
laod-balancing / DNAT on traffic destined to services.
laod -> load
docs/ovs-pipeline.md
Outdated

If the `conjunction` action is matched, packets are "allowed" and resubmitted
directly to [L3ForwardingTable]. Other packets go to [EgressDefaultTable]. If a
connection is established - as a remainder all connections are committed in
remainder?
docs/ovs-pipeline.md
Outdated
```
1. table=60, priority=200,ip,nw_src=10.10.1.2 actions=drop
2. table=60, priority=200,ip,nw_src=10.10.1.3 actions=drop
3. table=60, priority=80,ip actions=resubmit(,70)
```
For the sake of completeness, as you have done throughout this document, please also include one line about this resubmit rule.
docs/ovs-pipeline.md
Outdated
can go through.

The rest of the flows read as follows: if the source IP address is in set
{10.10.1.2, 10.10.1.3}, and the destination port is in the set {3, 4} (which
+1
docs/ovs-pipeline.md
Outdated
### IngressDefaultTable (100)

This table is similar in its purpose to [EgressDefaultTable], and it complements
[EgressRuleTable] for Network Policy egress rule implementation. In K8s, when a
this should be [IngressRuleTable]
docs/ovs-pipeline.md
Outdated
### IngressDefaultTable (100)

This table is similar in its purpose to [EgressDefaultTable], and it complements
[EgressRuleTable] for Network Policy egress rule implementation. In K8s, when a
egress rule impl.. -> ingress rule impl..
docs/ovs-pipeline.md
Outdated
[EgressRuleTable] for Network Policy egress rule implementation. In K8s, when a
Network Policy is applied to a set of Pods, the default behavior for these Pods
become "deny" (it becomes an [isolated
Pod](https://kubernetes.io/docs/concepts/services-networking/network-policies/#isolated-and-non-isolated-pods). This
non-isolated-pods). -> non-isolated-pods)).
docs/ovs-pipeline.md
Outdated
Network Policy is applied to a set of Pods, the default behavior for these Pods
become "deny" (it becomes an [isolated
Pod](https://kubernetes.io/docs/concepts/services-networking/network-policies/#isolated-and-non-isolated-pods). This
table is in charge of dropping traffic originating from Pods to which a Network
This should be "traffic destined to ..", since this is an ingress rule.
## Tables

![OVS pipeline](/docs/assets/ovs-pipeline.svg)
In the diagram I noticed an arrow from table 70 -> table 60. I believe it should be the other way around.
@@ -0,0 +1,561 @@
# Antrea OVS Pipeline

## Terminology
maybe somewhere in this doc add the ovs-ofctl command used to dump the flows?
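One possible shape for the command requested above, sketched in Python as a small helper that builds an `ovs-ofctl dump-flows` invocation. The bridge name `br-int` is an assumption about the deployment, `dump_flows_command` is a hypothetical helper, and the command has to run wherever the OVS tools can reach the switch. The thread does not say which form ended up in the doc.

```python
import shlex
from typing import List, Optional


def dump_flows_command(bridge: str = "br-int", table: Optional[int] = None) -> List[str]:
    """Build the argv for `ovs-ofctl dump-flows`, optionally limited to one table."""
    cmd = ["ovs-ofctl", "dump-flows", bridge]
    if table is not None:
        # A flow specification such as table=40 restricts the dump to that table.
        cmd.append(f"table={table}")
    return cmd


# Render the command as a shell-safe string, e.g. to paste into a terminal.
command_line = shlex.join(dump_flows_command(table=40))
```

Returning an argv list rather than a string keeps the helper usable with `subprocess.run` without shell quoting concerns.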
docs/ovs-pipeline.md
Outdated
This table handles all "tracked" packets (all packets are moved to the tracked
state by the previous table, [ConntrackTable]). It serves the following
purposes:
* keeps track of connections going through the gateway port; for all packets
I think I got this wrong: this mechanism applies to reverse traffic from a tunnel as well, not just from local backend Pods. @wenyingd could you confirm?
ConntrackStateTable also commits packets from the tunnel port into the ct_zone. It does so with the flow `table=31, priority=190,ct_state=+new+trk,ip actions=ct(commit,table=40,zone=65520)`.
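To make the structure of the flow quoted above easier to see, here is a minimal Python sketch that splits a dumped flow string into its match fields and its actions. `parse_flow` is a hypothetical helper written for this one format; it is not an OVS tool and does not handle every field syntax `ovs-ofctl` can print.

```python
def parse_flow(flow: str) -> dict:
    """Split a flow string into match fields and the raw actions string."""
    match_part, _, actions = flow.partition(" actions=")
    fields = {}
    # Match fields are comma-separated; spaces after commas are not significant.
    for token in match_part.replace(" ", "").split(","):
        if not token:
            continue
        key, sep, value = token.partition("=")
        # Bare keywords such as "ip" carry no value; record them as True.
        fields[key] = value if sep else True
    return {"match": fields, "actions": actions}


flow = "table=31, priority=190,ct_state=+new+trk,ip actions=ct(commit,table=40,zone=65520)"
parsed = parse_flow(flow)
```

For the quoted flow, the match side shows that only new, tracked IP packets are selected, and the action side shows the commit into zone 65520 followed by a jump to table 40.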
document](http://docs.openvswitch.org/en/latest/tutorials/ovs-conntrack/) for
more information on connection tracking in OVS.

### ConntrackStateTable (31)
Needs to be updated because of #213
docs/ovs-pipeline.md
Outdated
action, then the packets in the flow go through the switch in the same way
that they would if OpenFlow was not configured on the switch. Antrea uses this
action to process ARP traffic as a regular learning L2 switch would.
* *table miss-miss flow entry*: a "catch-all" entry in a OpenFlow table, which
+1
docs/ovs-pipeline.md
Outdated
that they would if OpenFlow was not configured on the switch. Antrea uses this
action to process ARP traffic as a regular learning L2 switch would.
* *table miss-miss flow entry*: a "catch-all" entry in a OpenFlow table, which
is used if no other flow is a match. If the table-miss flow entry does not
"is used if no other flow is matched"?
docs/ovs-pipeline.md
Outdated
is "known", i.e. corresponds to an entry in [L2ForwardingCalcTable], which is
essentially a "dmac" table.
* reg1 (NXM_NX_REG1): it is used to store the egress OF port for the packet and
is set by [L2ForwardingCalcTable].
docs/ovs-pipeline.md
Outdated
2. table=0, priority=200,in_port=tun0 actions=load:0->NXM_NX_REG0[0..15],resubmit(,30)
3. table=0, priority=190,in_port="coredns5-8ec607" actions=load:0x2->NXM_NX_REG0[0..15],resubmit(,10)
4. table=0, priority=190,in_port="coredns5-9d9530" actions=load:0x2->NXM_NX_REG0[0..15],resubmit(,10)
5. table=0, priority=80,ip actions=resubmit(,10)
This flow (no. 5) might be removed after PR #199 is checked in, and the action of the table-miss flow should be drop.
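The `load:0x2->NXM_NX_REG0[0..15]` action in the quoted flows writes a value into bits 0 to 15 of `reg0` and leaves the other bits alone. A minimal Python sketch of that bit-range write, treating the register as a plain integer; `load_bits` is a hypothetical helper used for illustration, not an OVS API.

```python
def load_bits(reg: int, value: int, start: int, end: int) -> int:
    """Write `value` into bits start..end (inclusive) of `reg`, keeping other bits."""
    width = end - start + 1
    mask = ((1 << width) - 1) << start
    # Clear the target bit range, then OR in the (truncated) shifted value.
    return (reg & ~mask) | ((value << start) & mask)


# Equivalent of load:0x2->NXM_NX_REG0[0..15] on a register with the upper bits set.
reg0 = load_bits(0xABCD0000, 0x2, 0, 15)
```

Here the upper 16 bits of the sample register survive the write, which is why other parts of the pipeline can use them independently of the value stored in bits 0 to 15.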
docs/ovs-pipeline.md
Outdated
If you dump the flows for this table, you should see something like this:
```
1. table=40, priority=200,ip,nw_dst=10.96.0.0/12 actions=output:gw0
2. table=40, priority=80,ip actions=resubmit(,50)
```
priority=80 will be replaced by priority=0 in all tables once PR: #195 is merged.
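The two table 40 flows quoted above amount to a CIDR check on the destination address. A rough Python sketch of that decision, using only the values shown in the flow dump; `table40_action` is a hypothetical name, and the priority change mentioned in the comment would not alter the outcome modeled here.

```python
import ipaddress

# Destination prefix taken from flow 1 in the quoted dump.
DST_CIDR = ipaddress.ip_network("10.96.0.0/12")


def table40_action(nw_dst: str) -> str:
    """Return the action the quoted table 40 flows would select for an IP packet."""
    if ipaddress.ip_address(nw_dst) in DST_CIDR:
        return "output:gw0"      # flow 1: send to the gateway port
    return "resubmit(,50)"       # flow 2: lower-priority fallback to table 50
```

The /12 prefix covers 10.96.0.0 through 10.111.255.255, so an address such as 10.112.0.1 already takes the fallback path.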
docs/ovs-pipeline.md
Outdated
If the `conjunction` action is matched, packets are "allowed" and resubmitted
directly to [L3ForwardingTable]. Other packets go to [EgressDefaultTable]. If a
connection is established - as a remainder all connections are committed in
[ConntrackStateTable] - its packets go straight to [L3ForwardingTable], with no
Do you mean "Established" connections?
It seems that all the patches to the OVS pipeline have been merged, so I'll update my PR to reflect the latest changes.
5a1174b to 3a34ae6
Thanks for your PR. The following commands are available:
These commands can only be run by members of the vmware-tanzu organization.
3a34ae6 to 46db62e
I addressed review comments and updated the doc to reflect the latest OVS pipeline. PTAL.
Any chance we can review / merge this?
Sorry, I have not reviewed this yet. IPSec introduced some changes: the IPSec tunnel itself, which we could document separately, and the change to load the tunnel ofport in the L3 table and skip the L2 and ingress policy tables. Check the commit description: 0d2e4d9. @wenyingd could comment on any other changes.
@jianjuns For IPSec support, I am also leaning towards a separate document or a future PR. I will update this PR to indicate that some flows are different when IPSec is enabled and to take into account the new table bypass for tunnelled traffic.
@antoninbas: what I mean is that the IPSec PR also changes the L3 flows for the normal tunnels. You might want to include that part in your doc.
46db62e to bba519a
@jianjuns yes that's what I meant. I have updated the document to account for the changes for normal tunnels.
bba519a to 1eb0978
Thanks for the review @wenyingd |
Overall LGTM, except for one comment.
docs/ovs-pipeline.md
Outdated

As for [EgressRuleTable], flow 1 (highest priority) ensures that for established
connections - as a remainder all connections are committed in
[ConntrackStateTable] - packets go straight to [L2ForwardingOutTable], with no
all connections are committed in [ConntrackCommitTable]?
Done. Thanks for catching this.
Some detailed documentation for the OVS pipeline, including a description of each table. This is directed at developers and people trying to troubleshoot issues.

It includes a high-level SVG diagram of the pipeline. We use SVG directly so it renders better on all screens and to avoid having to check in a "large" PNG image that may need to be updated often.

More documentation specific to the Network Policy implementation will follow later.

Fixes antrea-io#27
1eb0978 to b0484c4
LGTM.
/skip-all