-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for terminating QUIC #2557
Comments
How close of a match is the quic threading model to envoy's? Any known issues or expected pain points integrating google-quic into envoy? |
@alyssawilk Thanks for the great summary of the plan! Small ask: we are already tracking QUIC here: #1193. Can we consolidate issues? |
@ggreenway Threading is fine - the was written for GFE (Google's HTTP proxy) and Chrome, which have compatible threading models with Envoy. I think I covered the main pain points above. Pain point one being that the code isn't architected well for Envoy APIs (which we'll fix). And pain point two being "external deps" - while we tried to abstract away non-QUIC-core concepts like "how do you implement your IP Address" and "how are alarms managed" into the platforms directory, we really only abstracted things that GFE/Chrome didn't have in common. Given they're both predominantly Google code, I suspect we'll run into many things we haven't yet abstracted enough (like logging). Once we get it building I think the APIs will come fairly quickly, and the really tricky parts of the code (crypto and congestion) shouldn't be a problem. Other than that, well almost all the QUIC code is open sourced, but not all. I think our UDP proxying is GFE-specific enough we may want to reimplement from scratch, though we may have some useful utils and definitely will have input on how things can go wrong :-P. We'll also need to open source a bunch of the perf optimizations the current toy server didn't need, like reading from rx_ring with berkley packet filters and writing to raw sockets so perf won't suck for bandwidth-heavy users. |
@mattklein123 your call. I liken the two to "TCP proxying" and "HTTP termination" so I think they're different enough to warrant different tracking bugs. I also think they can be done substantially in parallel - I'm hoping @cmluciano will actually pick up UDP proxying while mpw@google/@juvexp will probably do the bulk of this one. |
@alyssawilk if you think the issues are different no problem we can keep both. |
Sorry for the lack of updates. I worked on this only a little bit since the bug was filed:
I plan to get back to the QUIC platform implementation in the next week, hopefully I can get it done by the end of this quarter. I'll send another update after that. |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions. |
@juvexp Any updates on your progress? |
Will this work involve supporting Google QUIC or IETF QUIC? Since QUIC is being standardized by the IETF, I think it's more beneficial to spend time on IETF QUIC. |
I agree, we should be doing IETF QUIC. |
The plan is to support both Google QUIC and IETF QUIC. Google QUIC is the fastest path to something that is production ready while continuing to work towards the IETF standard. |
Hi.. is someone working on bringing QUIC support to envoy? I saw a "help wanted" tag on #1193 as well.. Just want to understand the state of things. |
AFIK the Google-QUIC team that had picked this up got diverted from working on it, but are hoping to get back to it and land QUIC-Envoy integration in Q1. That said I wouldn't say that's anywhere near guaranteed - if you have interest or cycles in helping out help would definitely be appreciated :-) |
Thanks for the prompt response.. Is there any POC from Google's side that can give us more context? For e.g, if this needs changes to Chromium. (we contributed a small reverse proxy to chromium quic and had to move around some classes to make way) Definitely love to contribute, once we flesh out any dependencies :) |
Well I'm the one who generally pings the QUIC team when there's questions but @ianswett @RyanatGoogle who are actually deciding who does what. For contributions, I think we're going to want to reuse the google-QUIC code where we can for congestion control and crypto because (there's so many subtle things to get wrong and it'd be nice to not repeat mistakes!) but AFIK no one ever got around to open sourcing the internal proxy implementation, so if you're interested in doing QUIC work that'd be a great place to start. I'd be happy to chat about how that work might go when/if you have cycles as I have quite a bit of context. |
As @alyssawilk alluded to, we're currently working to take Google's QUIC code which is shared between Chromium and our internal proxy and export it to a stand-alone repository where it should be easier to consume for other projects. This repository will be DEPS'd into chromium (meaning we'll pull the contents of this repo directly instead of having duplicate or slightly modified files in Chromium). Consumer of this stand alone repo (of which Chromium will be one) will need to provide their own "platform impl" for the various platform specific QUIC dependencies. https://cs.chromium.org/chromium/src/net/third_party/quic/platform/impl/ This includes things like socket and clock abstractions, various compiler configurations, logging, etc. It will not require consumers to pull in all sorts of chromium dependencies. Hope that helps... |
Has anyone looked at https://github.com/ngtcp2/ngtcp2 ? Their event loop model fits envoy so might be a natural fit. |
@alyssawilk Thanks. agree on reusing chromium impl as much as possible for the same reasons. On flip side, something like ngtcp2 is very promising but not sure how production hardened it is, at this point. Given envoy is used by so many, it may be a lil premature to pull it into envoy and certify support. @RyanatGoogle The standalone repo makes total sense.. I remember you moved a lot of code already for our chromium proxy work. Is there more to be done there? In other words, is it just moving it to a new repo or there are more refactoring work that needs to be carefully orchestrated by google engineers to ensure the internal proxy continues to work? If latter, then it may not be easy for us (outside of google) to get started on this work? Thoughts? |
How do you compare "production ready" between refactored(untested) code vs something like ngtcp2? |
@vinothchandar The primary work remaining is to remove a whole pile of minor diffs between the internal and external versions of the code which have crept in over the years. This is a raft of CLs like: https://chromium-review.googlesource.com/c/chromium/src/+/1273814 Once that's complete it's mostly a simple matter of just moving the files to a new repo. We do not believe there is any careful refactoring to be done as part of this effort like there was when trying to add reverse proxy support to the toy server. @conqer I'm not sure if your question was about the Google QUIC repo or something else. If it was about the Google QUIC repo, that will contain the core QUIC implementation which is used for Chrome (and various Google apps which use Chrome's networking library) as well as our internal proxy. So this is very much tested production code. |
@conqer @vinothchandar I'm out for two weeks but sorting out a firm plan for QUIC is high on my priority list when I get back. If you would to be involved please send me email at From my perspective, both https://github.com/ngtcp2/ngtcp2 and https://github.com/h2o/quicly are on the table is viable options (nothing has been firmly decided yet). Everything will be hidden behind an interface no matter what, so Google's code can be swapped in if it can't be extracted into a library in the time that we want to get this done. As a pre-requisite to this work I'm probably also going to start helping out on basic UDP proxy support since no progress has been made on that. (#492) |
@mattklein123 sent you email. |
@RyanatGoogle that sounds promising.. We can try our hand at this then, looks like. If we atleast have a PoC version with Chromium working, then may be as bad to redo it with the new repo/independent lib. Correct me if I am missing something. @mattklein123 yes will send you an email to coordinate. It'd be awesome if you can also get @alyssawilk looped in. I have reasonable handle on the chromium code base for this, but total newbie to envoy. So appreciate all your help. |
Commit Message: allow HCM to config Http3Options and use it with other HCM configs, i.e. max_request_headers_kb and headers_with_underscores_action, to setup QuicHttpServerConnectionImpl. And support these configurations in QUIC. Currently the only Http3Options configuration is override_stream_error_on_invalid_http_message. Additional Description: added Http3 codec stats. Pass it along with Http3Options and other HCM configs. Risk Level: low Testing: enable related tests in quic_protocol_integration_test Docs Changes: updated docs/root/configuration/http/http_conn_man/response_code_details.rst Part of #12930 #2557 Signed-off-by: Dan Zhang <[email protected]>
Commit Message: use max header size in HCM config to initialize quic session. Additional Description: change QuicSession::Initialize() and filter chain creation order to set max header list size for each session before Initialize(). Move Initialize() of Http3 connection from CodecClient to CodecClientProd. Added CodecClient::connect() interface. Change EnvoyQuicConnection not to directly inherit from QuicConnection. And renamed it. Fix a use-after-free bug in FakeUpstream which causes ASAN failure. Risk Level: low Testing: more protocol H3 tests Part of #2557 #12930 #14829 Signed-off-by: Dan Zhang <[email protected]>
#16128) Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue. Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete(). This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic. Risk Level: low Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky. Part of #2557 #14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]>
Commit Message: Add initial stream and connection level flow control windows to envoy.config.core.v3.QuicProtocolOptions which is be used in QUIC listener config and Http3 upstream cluster config. Risk Level: low Testing: re-enable more Http3 downstream protocol test. Part of #2557 #12930 #14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]>
Commit Message: Add initial stream and connection level flow control windows to envoy.config.core.v3.QuicProtocolOptions which is be used in QUIC listener config and Http3 upstream cluster config. Risk Level: low Testing: re-enable more Http3 downstream protocol test. Part of envoyproxy#2557 envoyproxy#12930 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Commit Message: use max header size in HCM config to initialize quic session. Additional Description: change QuicSession::Initialize() and filter chain creation order to set max header list size for each session before Initialize(). Move Initialize() of Http3 connection from CodecClient to CodecClientProd. Added CodecClient::connect() interface. Change EnvoyQuicConnection not to directly inherit from QuicConnection. And renamed it. Fix a use-after-free bug in FakeUpstream which causes ASAN failure. Risk Level: low Testing: more protocol H3 tests Part of envoyproxy#2557 envoyproxy#12930 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
envoyproxy#16128) Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue. Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete(). This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic. Risk Level: low Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky. Part of envoyproxy#2557 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Commit Message: Add initial stream and connection level flow control windows to envoy.config.core.v3.QuicProtocolOptions which is be used in QUIC listener config and Http3 upstream cluster config. Risk Level: low Testing: re-enable more Http3 downstream protocol test. Part of envoyproxy#2557 envoyproxy#12930 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Commit Message: use max header size in HCM config to initialize quic session. Additional Description: change QuicSession::Initialize() and filter chain creation order to set max header list size for each session before Initialize(). Move Initialize() of Http3 connection from CodecClient to CodecClientProd. Added CodecClient::connect() interface. Change EnvoyQuicConnection not to directly inherit from QuicConnection. And renamed it. Fix a use-after-free bug in FakeUpstream which causes ASAN failure. Risk Level: low Testing: more protocol H3 tests Part of envoyproxy#2557 envoyproxy#12930 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
envoyproxy#16128) Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue. Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete(). This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic. Risk Level: low Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky. Part of envoyproxy#2557 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Commit Message: Add initial stream and connection level flow control windows to envoy.config.core.v3.QuicProtocolOptions which is be used in QUIC listener config and Http3 upstream cluster config. Risk Level: low Testing: re-enable more Http3 downstream protocol test. Part of envoyproxy#2557 envoyproxy#12930 envoyproxy#14829 Signed-off-by: Dan Zhang <[email protected]> Co-authored-by: Dan Zhang <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Unhiding HTTP/3 upstream and downstream configuration, linking to example configs, and updating docs for HTTP/3 alpha. Risk Level: n/a Testing: n/a Docs Changes: yes Release Notes: inline #14829 #2557 #15845 Fixes #12923 Co-Authored-By: Michael Payne [email protected] Signed-off-by: Alyssa Wilk <[email protected]>
Given QUIC alpha, I'm going to close this off as done - folks can follow along the mvp list |
Commit Message: make quic proof source and crypto streams extensions. Add config for default ones. If not specified in config, the default ones will be used. Risk Level: low Testing: existing tests passed Part of #2557 Co-authored-by: Dan Zhang <[email protected]>
…y#16658) Commit Message: make quic proof source and crypto streams extensions. Add config for default ones. If not specified in config, the default ones will be used. Risk Level: low Testing: existing tests passed Part of envoyproxy#2557 Co-authored-by: Dan Zhang <[email protected]>
Automated changes by [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action Signed-off-by: JP Simard <[email protected]>
Automated changes by [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action Signed-off-by: JP Simard <[email protected]>
We’ve had a few off-line discussions on how to maybe go about this with @mattklein123 but we’re now close enough it’s worth filing a tracking issue for it :-)
Here’s our plan A - any Envoy devs interested in QUIC support are encouraged to offer suggestions/improvements
Milestone 1 is to hack together “something which builds” using the existing Google QUIC code*, because as it turns out it takes years to debug all the weird corner cases in congestion control and crypto and it’s likely easier to copy existing code than write it all from scratch. That code base is “supposed to” build fairly cleanly as long as one implements all the things in quic/platform/impl but almost certainly doesn’t. This will be done in in @juvexp’s Envoy branch to allow rapid iteration - any interested parties are welcome to follow along and/or contribute there. Once the QUIC code builds and QUIC unit tests pass, it’s on to Milestone 2.
Milestone 2 is to get QUIC “working” with Envoy, where we have working integration tests. This may not include clean Envoy API use. For example, the first pass might treat QUIC as one logical Connection rather than one-Connection-per-stream Milestone 2 will likely involve landing some code in Envoy (things like UDP listeners if @cmluciano hasn’t beat us to it)
Milestone 3 is having QUIC fully landed in Envoy, with proper API wrappers for all the various codec / connection / crypto functionality QUIC supports. This will involve slowly cleaning up anything still only in @juvexp’s Envoy branch, along with having a story for gracefully handling upstream/downstream updates and contributions.
*currently visibile at https://github.com/chromium/chromium/tree/master/net/quic. Plot twist, while we’re hacking around in a custom branch the Google devs are likely to use “upstream” QUIC code, since then they can make changes to code structure and push them directly to Envoy rather than pushing from google3-upstream to Chrome to Envoy which will really slow development. Long term code syncing is a big glaring TBD as we need a library non-Googlers can contribute to, but also want to cheaply and easily get the latest security fixes and IETF spec updates from the dedicated Google dev team working on QUIC.
The text was updated successfully, but these errors were encountered: