Tracking issue for QUIC support #536
Yes. QUIC would improve our networking significantly. :)

We previously got hung up on keeping Noise IK via nQUIC = QUIC + Noise vs adopting QUIC with TLS 1.3 by defining some new grandpa TLS certificate for validators, crazier certificates for collators, and maybe others. I'm completely happy outsourcing an initial spec for one or both of these. It's also fine if we do something fast and warn everyone to expect yet another network protocol migration. It's conversely also possible that @tomaka knows some wish-list of libp2p messes that would be good to bundle in with this big a migration.

I'd expect a "grandpa TLS certificate" for validators to consist of a Merkle proof identifying the Ed25519 grandpa session keys, and then a certificate by the session key on the transport key. We should first migrate storage from our radix 16 hashing, which bloats in a dense Merkle tree, to radix 2 hashing with hash fast forwarding (and radix 16 caching). We'll want crazy certificates for collators, like VRFs for parachains running Babe, and worse for Sassafras in the future.

I've no idea if we properly exploit Noise IK yet anyway, meaning whether nodes initiating connections identify themselves first via something vaguely like this "grandpa TLS certificate". It's annoying if the Merkle proof that shows you're actually a validator does not fit into the MTU used for the first datagram, so the 4x space cost of our radix 16 hashing matters here too.

In brief, we need to decide how quickly we want this and what we want to try to rush to merge into such a migration. We'll surely punt on doing authentication perfectly optimally at first, but we're slowly creating the possibility for better authentication elsewhere, so it'll circle back here one day. :)
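A purely hypothetical sketch of the two-part binding described above, as data: a Merkle proof that the Ed25519 GRANDPA session key is in the validator set, plus a signature by that session key over the transport key. Field names and sizes are assumptions for illustration, not any actual Polkadot format.

```rust
// Hypothetical illustration only; not the actual Substrate/Polkadot certificate format.
struct GrandpaTlsCertificate {
    /// Ed25519 GRANDPA session public key.
    session_key: [u8; 32],
    /// Merkle proof that `session_key` belongs to the current validator set,
    /// anchored at a state root both peers already agree on. With radix 16
    /// hashing this proof is roughly 4x larger than a binary (radix 2) proof,
    /// which is why it may not fit into the first datagram's MTU.
    validator_set_proof: Vec<Vec<u8>>,
    /// Public key used by the QUIC/TLS transport.
    transport_key: [u8; 32],
    /// Ed25519 signature by `session_key` over `transport_key`, binding the
    /// transport identity to the validator identity.
    binding_signature: [u8; 64],
}
```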
I think the simplest would be QUIC with TLS 1.3 and no authentication, with authentication done later inside the stream. We could simultaneously give someone reputable a grant for designing nQUIC properly, and push our storage folks to fix our radix 16 hashing issue.
I'd strongly prefer for us to implement something that is already specified (i.e. TLS+QUIC) and not start creating something of our own.
So far I did some research and implemented:
The reason I did it is that I wanted to be compatible with the Go implementation of libp2p-quic. In the future I want to test our client against their server and vice versa. https://gist.github.com/kpp/c9c84411e17f4b27dddf0d438b289862 The code is ugly, but that's not the point. The point is that the Rust tool is binary compatible. The next step is to prettify the code, implement a certificate serializer, and create a PR. Also, I found a tool on the Internet to read DER files which helped me a lot to inspect the certificate: https://lapo.it/asn1js.
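For reference, a minimal sketch of inspecting such a DER certificate locally instead of pasting it into the asn1js viewer. It assumes a recent version of the `x509-parser` crate and a hypothetical `cert.der` path; it is not part of the actual PR.

```rust
// Minimal sketch, assuming a recent `x509-parser`; the file path is hypothetical.
use x509_parser::prelude::*;

fn main() {
    let der = std::fs::read("cert.der").expect("failed to read certificate file");
    let (_rest, cert) =
        X509Certificate::from_der(&der).expect("failed to parse DER certificate");

    println!("subject: {}", cert.subject());
    println!("issuer:  {}", cert.issuer());

    // The libp2p TLS handshake carries the peer identity in a custom extension,
    // so listing the extension OIDs is usually the interesting part.
    for ext in cert.extensions() {
        println!("extension OID: {}", ext.oid);
    }
}
```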
So far I:
These two weeks I:
Since the last time, I worked on multiple issues:
The current progress is: we are integrating libp2p-quic into rust-libp2p: libp2p/rust-libp2p#2159.
dvc94ch dropped out, so I opened my own PR: libp2p/rust-libp2p#2289
Here are the bench results:
Also, we still cannot connect to go-libp2p-quic. I am working on it.
It's QUIC with QUIC's standard TLS 1.3 then? Cool, we reinvent enough around here anyways. ;)
I discovered that I pollute Swarm threads in StreamMuxer::poll_event with:
And they are responsible for parsing packets and building packets respectively. I believe that's the main cause of the poor performance.
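A generic sketch of the direction hinted at here, not the actual libp2p-quic code: keep the per-packet work in a dedicated driver task and let the muxer's poll methods only forward already-prepared events over a channel. All names are made up for illustration.

```rust
// Illustrative only: `RawPacket`, `MuxerEvent` and `drive_connection` are made-up
// names, not the real libp2p-quic types. The point is that heavy packet parsing
// and building run on their own task instead of inside StreamMuxer::poll_event.
use futures::channel::mpsc;
use futures::StreamExt;

struct RawPacket(Vec<u8>);

enum MuxerEvent {
    InboundStream,
    Data(Vec<u8>),
}

/// Runs as its own task; parses incoming packets and builds outgoing ones,
/// then hands finished events to the muxer through a bounded channel.
async fn drive_connection(
    mut incoming: mpsc::Receiver<RawPacket>,
    mut events: mpsc::Sender<MuxerEvent>,
) {
    while let Some(RawPacket(bytes)) = incoming.next().await {
        // Stand-in for the real packet parsing / building done by quinn-proto.
        let event = MuxerEvent::Data(bytes);
        if events.try_send(event).is_err() {
            // Receiver dropped or channel full; a real driver would apply backpressure.
            break;
        }
    }
}
```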
The current state of this issue:
Any idea what the performance looks like?
I am working on it.
libp2p/rust-libp2p#2289 is merged.
- localhost: QUIC with libp2p-perf
- localhost: TCP with libp2p-perf
- a real-world bench on 2 VMs
@kpp are those benchmarks for the now merged implementation (libp2p/rust-libp2p#2289) or for libp2p/rust-libp2p#2801?
Interesting, so quite a dramatic loss of performance for now.
Much faster than the go stack on localhost, though. How much latency is there between the two VMs, and what was the instantaneous throughput like at the end of the test?

The discrepancy between localhost and "real world" is interesting, and hints at significant room for improvement in our congestion controller, which is known to have a few significant TODOs still (e.g. HyStart for Cubic). Which congestion controller did you use? Did you try the experimental BBR impl?
I've heard claims the congestion controller winds up being some application-specific black magic that Google never explains, but maybe you guys understand the underlying concerns there better?
I'm not sure what you mean. Quinn implements three different congestion controllers (New Reno, Cubic, and BBRv1), all of which are fairly well documented in the literature. The underlying logic is very similar to that used in the Linux kernel's TCP implementation, but there are a few optimizations to Cubic that we haven't implemented yet.
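For context, selecting one of those controllers via quinn-proto's transport config looks roughly like the sketch below. The exact signature of `congestion_controller_factory` has changed across releases, so treat the wrapping of the factory (plain value vs. `Arc`) as an assumption for a recent version rather than a fixed API.

```rust
// Rough sketch for a recent quinn-proto; the factory argument's exact type
// has varied between releases.
use std::sync::Arc;
use quinn_proto::{congestion, TransportConfig};

fn transport_with_bbr() -> TransportConfig {
    let mut cfg = TransportConfig::default();
    // New Reno and Cubic are the well-trodden options; BBR is the experimental
    // implementation mentioned above.
    cfg.congestion_controller_factory(Arc::new(congestion::BbrConfig::default()));
    cfg
}
```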
@elenaf9 these are for libp2p/rust-libp2p#2801. I will re-do it for the master branch too.
libp2p/rust-libp2p#3454 is merged.
QUIC is added to rust-libp2p: libp2p/rust-libp2p#2883 (comment)
I'd assume the benchmarks remain pretty lackluster?
@burdges the latest results were validated via https://github.com/libp2p/test-plans/tree/master/perf. Expect more metrics to come. But at this point, I don't see a reason to be pessimistic about QUIC performance.
I believe they're thinking of the numbers at #536 (comment), which suggest overly pessimistic congestion controller behavior when there's significant latency, which tends not to be reflected by loopback tests. The details I asked for in #536 (comment) could help drive further investigation. Of course, those older numbers were also worse than TCP, so maybe they're just obsolete.
There are a couple of reasons to support the QUIC network protocol.

The first, slightly minor, is that validators keep more than 1000 TCP connections open and are running into file descriptor limits.

The second is that we use Yamux for multiplexing multiple substreams into one connection, which is a rather limited protocol that does the best it can on top of TCP but can't do more. Because we can't know the TCP window size from user space, we can't properly allocate window sizes to individual substreams. If two or more substreams send a large volume of data, we run into head-of-line blocking issues. (This is the same reason why HTTP/2 is not so great and is being replaced with HTTP/3, if you want to read more.)

The third reason is that we suffer from a TCP slow-start issue: a TCP connection sits almost unused, and then we suddenly send, for example, 2 MiB on it. Because of TCP slow start, it will take a rather long time to send those 2 MiB even if the connection would otherwise be capable of sending them quickly.
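As a back-of-the-envelope illustration of that third point (assumed parameters: 1460-byte segments, an initial window of 10 segments, the window doubling each RTT, no loss), pushing a 2 MiB burst onto a cold connection takes on the order of eight round trips:

```rust
// Back-of-the-envelope slow-start model, not a real TCP simulation.
// Assumptions: MSS = 1460 bytes, initial cwnd = 10 segments, cwnd doubles
// every RTT (classic slow start), no packet loss.
fn main() {
    let mss = 1460u64;
    let mut cwnd_segments = 10u64;
    let target = 2 * 1024 * 1024u64; // the 2 MiB burst from the example above

    let mut sent = 0u64;
    let mut rtts = 0u32;
    while sent < target {
        sent += cwnd_segments * mss;
        cwnd_segments *= 2;
        rtts += 1;
    }
    // With these assumptions this prints 8 RTTs, i.e. ~0.8 s at a 100 ms RTT,
    // before the 2 MiB has even been handed to the network.
    println!("{} RTTs to push {} bytes", rtts, target);
}
```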
A first attempt was made in paritytech/substrate#6366.
So far there is https://crates.io/crates/libp2p-quic, which should be reviewed.
The list of sub-issues will be updated from time to time:

- Replace `x509-signature` with `picky`. The answer is that we can replace it with any crate we like, including `x509-parser`, `picky`, etc.
- `libp2p-quic` now supports TLS, but there are still some issues left.
- Integrate `libp2p-quic` into `rust-libp2p` with the help of its maintainers: Libp2p quic second attempt (libp2p/rust-libp2p#2159) and transports/quic: Add implementation based on `quinn-proto` (libp2p/rust-libp2p#2289).
- Integrate `libp2p-quic` into `substrate`: Add experimental support for QUIC (substrate#11514).