UDP proxying support #492
There is currently no idle timer in the tcp_proxy filter. It would be a useful feature to add in either case, and we would want it for UDP.
Agreed, almost all of the code can be shared. The name of the "tcp_proxy" filter is unfortunate. I don't know if I would bother renaming it right away. We can do that in a dedicated change if we want.
This is related to what @jamessynge was asking about in terms of why the Address interface also includes socket stuff. I mainly did this for simplicity. Ultimately, for UDP upstreams, we would probably like the ability to specify the upstream as udp://1.2.3.4:80 in the cluster definitions, CDS, etc. Given the current code, the simplest way to do this would be to have the Address interface also hold the socket type (as you mentioned in Gitter), and remove this parameter from the various socket-related functions. The alternative would be to split the Address interface and have an Address and a SocketAddress, where a SocketAddress contains an Address. I could really go either way on this; I don't think it's a huge deal.
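For illustration, a rough sketch of the two alternatives being weighed (hypothetical types, not Envoy's actual interfaces):

    #include <cstdint>
    #include <string>

    enum class SocketType { Stream, Datagram };

    // Alternative 1: the Address interface also holds the socket type, so
    // udp://1.2.3.4:80 parses to {"1.2.3.4", 80, SocketType::Datagram} and
    // the socket-related functions no longer need a type parameter.
    struct Address {
      std::string ip;
      uint16_t port;
      SocketType type;
    };

    // Alternative 2: split the interface; a SocketAddress contains an Address.
    struct PlainAddress {
      std::string ip;
      uint16_t port;
    };

    struct SocketAddress {
      PlainAddress address;
      SocketType type;
    };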
Per above, I don't think the filter needs to do anything different from what it does today. The code can be pretty much identical, along with an idle timer to destroy things. Where you have to deal with UDP is probably inside ConnectionImpl: you are going to need to know that it is UDP and deal with MTU there. Doing anything else will be too complicated, I think.
I don't think you need to worry about this. We will need to have UDP listeners, which bind, and have a filter stack. All of the normal rules then apply for where to forward. Along these lines, we are going to need to make the listener configuration more extensible. Right now we just support "port". I would like to extend this to be something along the lines of:
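For illustration only (the concrete schema was still under discussion; the field names here are hypothetical), the idea is that each listener entry would carry a full address rather than a bare port, e.g.:

    listeners: [
      {
        "address": "udp://0.0.0.0:80",
        ...
      }
    ]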
Doing the above will make it easy for schemas, allow us to have pipe listeners, do IPv6, etc. In general I would appreciate it if you could sync up with @jamessynge on all of this, as I think it's related to the IPv6 stuff, as well as future work I know probably needs to happen around QUIC, etc.
@moderation ^^^ Can you provide any info on the specifics of what scenario you need to support in terms of MTU handling, etc.? Want to make sure we are hitting a specific use case. @rshriram if we do this, I would like to do it in several different changes. For example, we could start with adding UDP listeners, which proxy to TCP. That is a pretty straightforward change and is independent.
I agree with splitting this into multiple PRs. There are small changes to different subsystems, and we should do this piecemeal to make sure we can triage issues easily. I am unfamiliar with the requirements of QUIC. With regard to the bind_config, it looks very structured, but a couple of questions: why do we need to key off the address as well, when the port is what matters? (Is this related to the issue that @kyessenov posted?) Secondly, will this config be backward compatible with existing configs? It seems to break the config format. Here is an alternate format (I am okay with either one, frankly):

listeners: [
  {
    "port": 80,
    "port_type": "udp|tcp"  [tcp is the default]
    ...
  }
]
Has there been any progress made on this effort? Is it in active development or open to contribution?
No one is working on this that I know of. This did come up today in the context of something that would be good to work on. This is actually a fairly complicated feature and needs some thinking. @shalako can you provide more color on what you actually need here? Do you just need UDP -> UDP? UDP -> TCP? Should datagram boundaries be preserved? Etc.
@shalako can keep me honest here, but I suspect our expectations are:
I’m interested in moving this forward. Are there more pertinent discussions that I should take a look at before starting some of the changes mentioned above?
@cmluciano I don't think anyone is actively working on this. This one is probably best served by a short design doc (1-2 pages, nothing fancy). Do you want to browse the code and then maybe we can collaborate on the doc contents? Would love to get this being worked on. FYI, there is also some interest from Cisco in helping out with this, but it's unclear when they would have time. I think we can get started if you have cycles.
FWIW, I'm interested in this work for the purpose of proxying SIP traffic and RTP media streams in and out of a k8s cluster. When appropriate, I'll be happy to assist with setting up some services and doing some testing, if that's helpful to you @cmluciano & @mattklein123.
@mattklein123 Sounds good to me. I will take a look through the codebase and let you know when I'm ready for the doc. @jevonearth Thanks! I will ping you when ready.
Let me ask a few questions here about functional behavior that I don't see in the issue yet. I presume we want to be able to specify:

udp://${proxyIp}:${proxyPort} -> udp://${proxiedToIp}:${proxiedToPort}

Correct? So a packet's headers would be transformed like this:

dstIp = ${proxyIp} -> ${proxiedToIp}

and going the other way:

dstIp = ${proxiedToIp} -> ${clientIp}

Is that the desired behavior?
@hagbard5235 ^ is my assumption, but part of the reason that I think we need a design doc on this one is that it's honestly not clear to me exactly what the behavior should be. For example, it's easy enough to fit UDP into Envoy filter chain semantics by raising onData() for each datagram, but what if the user tries to send a datagram that is too large for the target MTU? (Either because the path MTU does not match, or because we are doing TCP -> UDP.) Also, there are some thorny issues around listening for UDP datagrams and the Envoy threading/filter model that need to be thought through.
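To make the per-datagram onData() idea concrete, here is a minimal standalone sketch using plain POSIX sockets (not Envoy's actual APIs; the callback and constant names are illustrative):

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <cstddef>
    #include <cstdio>
    #include <functional>

    constexpr size_t kAssumedPathMtu = 1400;  // placeholder; real path MTU varies

    // Raise one callback per datagram, preserving datagram boundaries;
    // the analogue of calling onData() once per received packet.
    void readLoop(int fd, const std::function<void(const char*, ssize_t)>& on_data) {
      char buf[65536];  // maximum possible UDP payload
      for (;;) {
        sockaddr_in peer{};
        socklen_t len = sizeof(peer);
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             reinterpret_cast<sockaddr*>(&peer), &len);
        if (n < 0) break;
        on_data(buf, n);
      }
    }

    // Writing toward a UDP upstream: a datagram larger than the path MTU
    // either fragments or is dropped, so the proxy must pick a policy.
    bool writeDatagram(int fd, const char* data, size_t size) {
      if (size > kAssumedPathMtu) {
        fprintf(stderr, "datagram of %zu bytes exceeds assumed MTU\n", size);
        return false;  // reject, fragment, or just document the limit
      }
      return send(fd, data, size, 0) == static_cast<ssize_t>(size);
    }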
Oh good... so I wasn't the only one not seeing clarity then ;) Sounds like there's a desire to do TCP -> UDP and UDP -> TCP proxying as well. Does anyone have an example use case for those transitions? I'm curious how we anticipate them being used :)
Is it ok to start with UDP -> UDP and to allow packets to fragment on the way out, if there's an MTU issue?
No idea if this is needed or not; I just want to make sure we consider it in the design and exclude it with appropriate thinking. Either way, the MTU mismatch issue and threading issues will need to be dealt with.
Fragging may not be supported in the environment. In v1 we can likely ignore MTU issues and just document them. This leaves threading. Anyway, just want to capture all of this in the design. :)
@mattklein123 I'm cool capturing TCP -> UDP and UDP -> TCP in the design :) I was asking because having more concrete examples available often helps in the design process :) Question: are we doing a pure UDP proxy (i.e., we make our decisions on ip proto=UDP and port, and purely mutate ip:port fields), or are we looking into the datagrams to make proxy decisions? #socraticdesign ;)
My thinking here was to go for a full L4 proxy. Basically something like: Using this model, the "tcp_proxy" I think should mostly "just work" modulo some minor changes. For QUIC, in the future, we will need to do some pure L4 proxying, but IMO we should try to actually fit this within the existing filter model as much as possible. The main issue that we have to solve (that I don't know the answer to off the top of my head) is how to route the UDP packets between multiple threads. Basically, a connection today is bound to a thread along with its filters. This breaks down for incoming UDP packets that are not part of a connection. E.g., do we have all workers listen for packets and somehow forward? Only initially support UDP with 1 worker? Etc.
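One possible answer to the thread-routing question, sketched under the assumption that hashing the datagram's source tuple can pin all packets from one peer to a single worker (speculation, not a settled Envoy design):

    #include <cstddef>
    #include <cstdint>
    #include <functional>

    // Pick a worker for an incoming datagram by hashing its source (ip, port).
    // Every datagram from a given peer lands on the same worker, so per-peer
    // session state never needs cross-thread synchronization.
    size_t pickWorker(uint32_t src_ip, uint16_t src_port, size_t num_workers) {
      const uint64_t key = (static_cast<uint64_t>(src_ip) << 16) | src_port;
      return std::hash<uint64_t>{}(key) % num_workers;
    }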
I am not sure how threading becomes an issue. For a first cut, it's basically a dumb datagram proxy.

A more fundamental issue is the semantics. Are we going to load balance per datagram? Seems strange. We might need to reuse the Ketama hash or the IP hash and send packets to the same destination host. Any other load balancing algorithm seems unintuitive IMO.

The cluster will have to change as well. It's wedded to stream semantics in terms of circuit breakers. The notion of failure of a host is not going to work given that it's datagrams we are sending (fire and forget). So things like outliers, panic thresholds, etc. are out the window.

A straw-man impl would just take a watered-down version of tcp_proxy and hardwire it to an IP-hash-based cluster where everything related to reliability is turned off.

@grosenhouse would this be a sufficient first cut for CF?
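As a concrete reading of that straw man, a minimal sketch in which the "cluster" is nothing but a source-IP hash over a fixed host list, with all reliability machinery (outliers, panic thresholds, circuit breakers) deliberately absent; the names are illustrative, not Envoy's cluster API:

    #include <cstdint>
    #include <functional>
    #include <string>
    #include <vector>

    struct Host {
      std::string address;
      uint16_t port;
    };

    // Straw-man host selection: pure IP hash, fire and forget. No health
    // checks or outlier detection, since datagram loss is invisible here.
    const Host& pickUpstream(const std::vector<Host>& hosts, uint32_t src_ip) {
      return hosts[std::hash<uint32_t>{}(src_ip) % hosts.size()];
    }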
This is the first commit in a series to support UDP proxying. There are quite a few TODOs in the code before this feature can be considered an MVP. Part of #492
Exciting to see the udp_proxy scaffolding get merged! 🎉 What's next? Is the roadmap at the top of the issue directionally accurate?
Another bunch of work towards #492. The remaining work is proper wiring up of upstream cluster management, host health, etc. and documentation. This will be done in the next PR. Signed-off-by: Matt Klein <[email protected]>
Fixes #492 Signed-off-by: Matt Klein <[email protected]>
MVP complete pending code reviews here #492 if anyone wants to kick the tires.
Edit: (@alyssawilk on behalf of @cmluciano)
Design doc:
https://docs.google.com/document/d/1G9IVq7F7Onwinsl6EYzGsdzAGvVbo2FGfcPt35ItIx8
Roadmap
Original top level comment (@rshriram)
Just like TCP proxying, it would be great if Envoy had support for UDP proxying as well.
The current code for TCP proxying is pretty generic for the most part. The flow is something like this:
on_connection_received_callback()
  --> pick upstream and connect to it
on_data_received_callback(data)
  --> write_to_upstream(data)
on_stream_reset_callback()  [downstream reset or upstream reset?]
  --> cleanups
Based on a cursory scan through the code, there is also a timer that cleans up connections beyond a certain period of inactivity (@mattklein123 please confirm).
In terms of UDP support, much of the code in filters above can be repurposed or renamed to be generic to TCP/UDP where possible.
The ClientConnectionImpl class hardcodes the socket type to be Stream. This needs to be changed.
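At the BSD socket level the change is tiny; a sketch with plain POSIX calls (not the actual ClientConnectionImpl code) of making the hardcoded stream type a parameter:

    #include <sys/socket.h>

    // SOCK_STREAM (TCP) is what ClientConnectionImpl effectively hardcodes
    // today; supporting UDP means choosing SOCK_DGRAM at creation time.
    int makeSocket(bool datagram) {
      return socket(AF_INET, datagram ? SOCK_DGRAM : SOCK_STREAM, 0);
    }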
UDP packets with source port 0 should be dropped (?).
Instead of creating/destroying UDP connection objects per packet, the process can be optimized by having a keepalive-style timer that deletes the connection objects after the timer expires. As a first-order approximation, the UDP datagram size can be fixed to one MTU or less, which should suffice (RFC 791, RFC 2460). We do not need to buffer up data and send it out. WDYT?
In terms of session affinity, packets from the same (src ip, src port) would go to the same (dst ip, dst port).
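A minimal sketch of the keepalive-timer-plus-affinity idea above, assuming sessions keyed by the downstream (source IP, source port) with a periodic idle sweep (illustrative names, not Envoy's implementation):

    #include <chrono>
    #include <cstdint>
    #include <map>
    #include <utility>

    using Clock = std::chrono::steady_clock;

    struct UdpSession {
      int upstream_fd;            // connected UDP socket toward the chosen host
      Clock::time_point last_rx;  // refreshed on every datagram
    };

    // Key sessions by downstream (src ip, src port) so every packet from the
    // same peer reuses the same "connection" object and upstream destination.
    std::map<std::pair<uint32_t, uint16_t>, UdpSession> sessions;

    // Keepalive-style cleanup: rather than destroying state per packet,
    // delete a session only after it has been idle past the timeout.
    void expireIdle(Clock::duration idle_timeout) {
      const auto now = Clock::now();
      for (auto it = sessions.begin(); it != sessions.end();) {
        if (now - it->second.last_rx > idle_timeout) {
          it = sessions.erase(it);  // closing upstream_fd omitted for brevity
        } else {
          ++it;
        }
      }
    }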