
rcmgr: connection-level memory base limits are too small for performant QUIC transfers #1706

Closed
Tracked by #1690
marten-seemann opened this issue Aug 15, 2022 · 6 comments · Fixed by #1740

Comments

@marten-seemann
Contributor

Our connection-level memory base limit is 1 MB: https://github.com/libp2p/go-libp2p-resource-manager/blob/9bb1bbdc782e0f7c43768e1946e659432edbc6b0/limit_defaults.go#L507-L513

quic-go starts with a stream flow control window of 512 kB and a connection flow control window of 768 kB: https://github.com/lucas-clemente/quic-go/blob/8c0c481da1644f9934df399c50649d65967d7f22/internal/protocol/params.go#L24-L28

This means that with the base limit, we will never be allowed to increase the flow control limits of a QUIC connection. Throughput is then capped by flow control.

How bad is this?

I ran some benchmarks between two servers (US East to Europe) located about 80 ms RTT apart. With TCP, I was able to achieve transfer speeds of roughly 33 MB/s, while QUIC only achieved 6 MB/s.

The situation gets better for larger servers, as limit autoscaling will allocate more memory to the connection. I don't think QUIC connections being significantly slower than TCP connections is tenable on any size of machine though.

Problem Analysis

The problem here is the following:

  1. We allow 128 connections in the base configuration, and it's questionable whether we can go much lower than this value if we want to keep libp2p usable.
  2. Our system memory limit is 128 MB, so we can't really go beyond 1 MB per connection; otherwise we might use all of our allowed memory for the QUIC transport, rendering the rest of the libp2p stack useless, as it wouldn't be able to allocate even a small buffer for message parsing.
  3. It is quite easy for a malicious node to make us allocate its entire flow control window: all it has to do is withhold the first byte of a stream (preventing us from reading from it) and then fill up the flow control window with data.
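The budget arithmetic behind points 1 and 2 can be made explicit. This is just the division described above, not code from the repository:

```go
package main

import "fmt"

const (
	systemMemoryLimit = 128 << 20 // 128 MiB system-wide memory limit
	maxConns          = 128       // connections allowed by the base config
)

func main() {
	perConn := systemMemoryLimit / maxConns
	fmt.Printf("memory available per connection: %d KiB\n", perConn>>10)

	// With 128 connections each holding its full 1 MiB, the QUIC transport
	// alone can exhaust the entire system memory limit.
	fmt.Printf("worst case: %d MiB of %d MiB\n",
		(maxConns*perConn)>>20, systemMemoryLimit>>20)
}
```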

Possible Solution

I have the feeling that this brings us back to #1708: memory management is really only useful for muxers on the libp2p layer.
If we limit memory accounting to muxers (the only real consumers of memory using the rcmgr) and introduce a dedicated muxer memory pool, this would give us 128 MB for QUIC connections. Each connection would start with a 768 kB connection flow control limit, consuming 96 MB of memory in the worst case. With the remaining 32 MB, some QUIC connections (the ones currently being used for high-BDP transfers) would be able to increase their flow control windows.
We might also be able to teach quic-go to decrease flow control windows once utilization of a connection goes down, so that a long-lived connection that transferred a lot of data an hour ago doesn't keep occupying a lot of memory.
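The worked numbers for the proposed muxer pool, spelled out (the pool size is the proposal's assumption, not an existing rcmgr limit):

```go
package main

import "fmt"

func main() {
	const (
		muxerPool   = 128 << 20 // proposed dedicated muxer memory pool: 128 MiB
		maxConns    = 128       // connections in the base configuration
		initialConn = 768 << 10 // quic-go's initial connection flow-control window
	)

	// Worst case: every connection sits at its initial window.
	baseline := maxConns * initialConn
	// Whatever is left can be handed to high-BDP connections for window growth.
	headroom := muxerPool - baseline

	fmt.Printf("baseline: %d MiB, headroom: %d MiB\n", baseline>>20, headroom>>20)
}
```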

cc @vyzo @MarcoPolo @Stebalien

@vyzo
Contributor

vyzo commented Aug 15, 2022

Looks like autoscaling is doing it wrong; at the very least it should allow the full 16M per conn, but with the sum across conns averaging at 1M.
That solves half the problem.

The other half is returning memory from the transport, or oversubscribing with some kill semantics.

The attack you describe is so primitive that there are many ways to counter it; a simple timeout will do. What does Google do on their servers? I would be very surprised if you could just OOM them by opening 1M conns.

@marten-seemann
Contributor Author

The attack you describe is so primitive that there are many ways to counter it; a simple timeout will do. What does Google do on their servers?

There are many variations of this attack. Essentially, you're in slow-loris territory here, which is notoriously difficult to defend against in the general case. Google doesn't care about DoS attacks; they'll just load-balance to more machines. (Google doesn't even care about DoS attacks in their code that let you take out one of their servers by sending 100 kb of data. I reported such vulnerabilities to them multiple times, and they didn't even consider me for their bug bounty program. Won't do that again.)

Looks like autoscaling is doing it wrong; at the very least it should allow the full 16M per conn, but with the sum across conns averaging at 1M.

That would mean starting with a very small receive buffer, which would be bad for performance.
Even if it were feasible, the rcmgr wouldn't allow us to implement such logic, as we don't have a muxer scope.

@vyzo
Contributor

vyzo commented Aug 15, 2022

You misunderstand: we can set the connection limit to 16M, but the total aggregate to less than conns x 16M (say 1M per conn).
The previous limit logic already did that, and it's a form of oversubscription.

To make things maximally effective, we need ways to return memory.
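One way to read the oversubscription idea: each connection is allowed to grow to the full per-connection cap, but reservations are also checked against a much smaller aggregate cap. A minimal sketch, where the type and method names (`oversubscribedPool`, `reserve`) are hypothetical and not rcmgr APIs:

```go
package main

import (
	"errors"
	"fmt"
)

// oversubscribedPool sketches the limit shape described above: any single
// connection may grow to perConnCap, but the sum across all connections is
// capped well below conns x perConnCap.
type oversubscribedPool struct {
	perConnCap int64 // e.g. 16 MiB per connection
	totalCap   int64 // e.g. conns x 1 MiB in aggregate
	used       map[string]int64
	total      int64
}

var errOverLimit = errors.New("reservation exceeds limit")

// reserve grants n bytes to conn only if both the per-connection cap and
// the aggregate cap still hold.
func (p *oversubscribedPool) reserve(conn string, n int64) error {
	if p.used[conn]+n > p.perConnCap || p.total+n > p.totalCap {
		return errOverLimit
	}
	p.used[conn] += n
	p.total += n
	return nil
}

func main() {
	p := &oversubscribedPool{
		perConnCap: 16 << 20,        // any single conn may grow to 16 MiB...
		totalCap:   128 * (1 << 20), // ...but 128 conns share 128 MiB in total
		used:       map[string]int64{},
	}
	fmt.Println(p.reserve("conn-1", 16<<20)) // one conn can take its full 16 MiB
	fmt.Println(p.reserve("conn-1", 1))      // but not a byte more
}
```

Under this shape, a handful of busy connections can each take 16 MiB, while the aggregate cap guarantees the transport as a whole stays within budget; the trade-off is that late reservations fail once early ones have consumed the pool, which is where reservation priority and memory return come in.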

@marten-seemann
Contributor Author

You misunderstand: we can set the connection limit to 16M, but the total aggregate to less than conns x 16M (say 1M per conn).
The previous limit logic already did that, and it's a form of oversubscription.

How would that prevent the QUIC transport / the muxer from consuming all the memory in the system scope?

@vyzo
Contributor

vyzo commented Aug 15, 2022

Actually, that's the peer scope limit; conn memory is not relevant.
Either way, point stands; if autoscaling evenly divides, it is doing it wrong.

@vyzo
Contributor

vyzo commented Aug 15, 2022

It can only consume up to 16M per peer.
Couple that with aggregation and reservation priority, and if you have some peers consuming all their memory, subsequent peers will get smaller chunks.

What is important here is finding ways to return memory / shrink windows and buffers.
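A possible shape for "returning memory" is a decay policy: if a connection used only a fraction of its window over the last measurement interval, shrink the window toward its initial size and hand the difference back to the pool. This is a hypothetical sketch (`decayWindow` is not a quic-go API), assuming quic-go could be taught such a policy:

```go
package main

import "fmt"

// decayWindow shrinks an under-utilized flow-control window. If the
// connection used less than half its window during the last interval,
// the window is halved (but never below the initial window) and the
// freed bytes can be returned to the memory pool.
func decayWindow(window, usedLastInterval, initialWindow int) (newWindow, freed int) {
	if usedLastInterval >= window/2 || window <= initialWindow {
		return window, 0 // busy or already minimal: keep as-is
	}
	newWindow = window / 2
	if newWindow < initialWindow {
		newWindow = initialWindow
	}
	return newWindow, window - newWindow
}

func main() {
	// A connection that grew to 8 MiB during a bulk transfer an hour ago
	// but is now nearly idle (64 KiB used in the last interval).
	w, freed := decayWindow(8<<20, 64<<10, 768<<10)
	fmt.Printf("window: %d KiB, freed: %d KiB\n", w>>10, freed>>10)
}
```

Repeated intervals of low utilization would walk the window back down to the 768 kB initial value, so a long-lived but idle connection eventually stops occupying its peak reservation.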

@marten-seemann marten-seemann transferred this issue from libp2p/go-libp2p-resource-manager Aug 19, 2022
@marten-seemann marten-seemann changed the title connection-level memory base limits are too small for performant QUIC transfers rcmgr: connection-level memory base limits are too small for performant QUIC transfers Aug 19, 2022
@marten-seemann marten-seemann mentioned this issue Aug 23, 2022