Support DSR mode for LoadBalancerIPs with AntreaProxy #5025
@antoninbas I tried to reproduce the latency of the learned flow, but it seemed to work fine even when the first packet and the second packet had a very small interval:
And the packet counter confirmed that the subsequent packets always hit the learned flow:
10,000 connections succeeded 100% of the time when there were two different backends on different Nodes, even after I changed the
So I suspect the original issue has been fixed in OVS 2.17.6. Do you have a way to reproduce the original issue?
@tnqn My best guess is that the issue doesn't happen for packets belonging to the same connection? One way to confirm this would be to update your learned flow so that it doesn't match on the source port, then trigger several back-to-back connections.
@antoninbas Yes, after I removed the source port from the learned flow, the 2nd or 3rd connection had a high chance of failing, which means the delay still exists. After checking the OVS documentation on how OVS selects a bucket, I suspect it's not due to the microflow cache but because the selection is based on the 5-tuple.
For SessionAffinity, because the learned flow matches only the source IP and not the source port, a second connection using a different source port may get a different bucket before the learned flow is realized. For DSR:
- When the learned flow is per source port, all packets of a connection will always get the same bucket, regardless of whether the learned flow exists.
- When the learned flow is per source IP, subsequent packets of a connection will get the same bucket as the first packet before the learned flow is realized, but may change to another bucket after the learned flow is realized.

So there should be no problem as long as we keep the source port in the learned flow. I also found something more interesting: if I change the selection method from
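The bucket-selection behavior described above can be sketched as follows (a simplified model for illustration only; OVS's real select-group hashing differs in detail, and the IPs, ports, and bucket count here are made up):

```python
# Simplified model of 5-tuple-based bucket selection. All packets of one
# connection share the 5-tuple, so they always land in the same bucket;
# a new connection with a different source port may pick a new bucket.
def pick_bucket(src_ip, src_port, dst_ip, dst_port, proto, n_buckets):
    return hash((src_ip, src_port, dst_ip, dst_port, proto)) % n_buckets

b1 = pick_bucket("10.0.0.1", 40001, "10.96.0.10", 80, "tcp", 2)
b2 = pick_bucket("10.0.0.1", 40001, "10.96.0.10", 80, "tcp", 2)
assert b1 == b2  # same connection -> same bucket, learned flow or not
```

This is why keeping the source port in the learned flow is safe for DSR: the learned flow can only ever agree with the 5-tuple-based selection for that connection.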
Besides, I found the reason why running the benchmark with new connections in DSR mode had worse performance than normal mode: inserting a learned flow into the datapath incurs extra latency. After I reduced the number of learned flows by masking the source ports, ensuring at most 64 flows would be learned, the performance became much better:
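The port-masking idea can be illustrated with a quick check (the exact mask is not shown in the thread; this assumes the low 6 bits of the source port are kept, which is one way to bound the number of learned flows at 64):

```python
# Keeping only 6 bits of the source port bounds the number of distinct
# learned-flow matches at 2**6 = 64, regardless of how many source
# ports the clients actually use.
ports = range(1024, 65536)            # typical ephemeral source-port range
masked = {p & 0x003F for p in ports}  # keep only the low 6 bits
assert len(masked) == 64
```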
I'm going to implement DSR with the revised flows if there is no other problem.
I am not sure I understand this comment, unless you are talking about multiple connections, with different source ports. I believe there is a significant difference between how
This is from an old patch for
And the OVS manual entry for
Also from the OVS manual, we have this for
Which seems to be a reference to one drawback of using
My takeaways from all this are as follows:
With regards to the scalability issues of
The scalability impact is the same for
So only the increased latency may be of concern?
I meant:
I think the scalability issue and the higher latency refer to the fact that the
It seems to me that option 3 is the most appropriate approach: it has nearly the best performance, generates a reasonable number of flows, and balances load relatively evenly.
That's good data. I agree that Option 3 looks like a good approach.
It feels to me that there are 2 flows being installed, one caused by the usage of
There are indeed 2 flows being installed for each connection, but they are caused by two different ct_states:
For the second packet of a connection, the ct_state is "-new+inv+trk"; it won't match the above flow and will be upcalled due to the usage of
And because the second packet is upcalled, the learned flow from the first packet will apply to it, so it's not subject to revalidator processing, and it triggers installing the learned flow into the datapath immediately. In theory, we could change some flows to avoid the datapath flows being generated with ct_states; then even the second packet of the first connection wouldn't be upcalled and generate another datapath flow.
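The datapath and OpenFlow state described above can be inspected directly on a Node. These are generic diagnostic commands, not commands from the thread; the bridge name and table number are hypothetical and will vary by Antrea version:

```
# Dump the megaflows installed in the kernel datapath; entries generated
# with ct_state matches (e.g. for the first and second packet of a new
# connection) show up here.
ovs-appctl dpctl/dump-flows

# Compare with the flows that the learn action populated in the
# OpenFlow table (table number 40 is only an example).
ovs-ofctl dump-flows br-int table=40
```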
Thanks @tnqn, things are much clearer now. Based on your explanation, if we replace dp_hash with hash for the SessionAffinity implementation, I am fine with reverting the
However, I assume we should stick with dp_hash for the default Service case? One more question: in your table, for hash, do you always use src IP + masked port? I am surprised we have so many flows in the first case, "hash, no learn action" (17537).
Yes,
All the tests in the table were executed without using masked port in
That explains the large number of flows. I am surprised the latency is not a bit higher for case 1, given that each new connection needs to be upcalled (btw, what's the unit for
For our DSR use case (test 3 in the table), do you think it makes any difference to mask the source port in the
The unit is ms. The latency when each new connection needs to be upcalled (cases 1, 2, 6) is indeed higher than the others (cases 3, 4, 5).
There is no obvious performance difference according to the tests. I thought there could be issues if 2 connections with the same masked source port were established simultaneously when we don't mask the source port with
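The collision scenario being discussed can be made concrete (the ports and the 6-bit mask below are hypothetical, since the exact mask is elided above): two connections whose source ports differ by a multiple of 64 produce the same masked value, so if the learned flow matched only the masked port, both connections would hit the same learned entry.

```python
# Two distinct source ports that collide under a 6-bit mask.
MASK = 0x003F
p1, p2 = 40000, 40064   # hypothetical ephemeral ports, 64 apart
assert p1 != p2
assert p1 & MASK == p2 & MASK  # identical masked value -> same match
```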
That's also what I thought could happen
You mean, when running
Let's mask the source port in
Yes, I tried
Sure.
Describe what you are trying to solve
See #4956 for the original issue.
In DSR (Direct Server Return) mode, the load balancer routes packets to the backends (typically without changing the src/dst IPs). The backends process the requests and reply directly to the clients, without passing through the load balancer.
Pros:
Use case: in-cluster LoadBalancers, e.g. Metallb, Antrea ServiceExternalIP
Describe the solution you have in mind
The diagram shows how the traffic flows in DSR mode:
The tricky part is how to persist the load-balancing result of the first packet of a connection on the ingress Node, considering the following caveats:
Potential Solution
Example Flows with learn actions:
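The original example flows are not reproduced here. The following is a hedged sketch of what a learn-based flow for persisting the DSR load-balancing decision might look like, following the conventions of the OVS `learn` action (the table numbers, register assignments, timeouts, and match values are all hypothetical, including the assumption that the selected backend is stored in NXM_NX_REG3 and that only 6 bits of the source port are matched):

```
table=41,priority=200,tcp,reg4=0x10000/0x70000
  actions=learn(table=40,idle_timeout=160,priority=200,delete_learned,
    eth_type=0x800,nw_proto=6,
    NXM_OF_IP_SRC[],NXM_OF_IP_DST[],NXM_OF_TCP_DST[],NXM_OF_TCP_SRC[0..5],
    load:NXM_NX_REG3[]->NXM_NX_REG3[]),
  resubmit(,42)
```

The intent: once the select group has chosen a backend for a new connection, the learned flow pins subsequent packets with the same src IP, dst IP, dst port, and masked src port to the same backend register value, so the 5-tuple-based group selection is bypassed for follow-up traffic.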
Describe how your solution impacts user flows
Users can set the LoadBalancer mode to DSR in the antrea-agent config; LoadBalancerIPs of Services will then work in DSR mode.
Test plan