-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Zero allocation connectionless sockets #30797
Comments
If the core issue is repeated conversions from EndPoint to SocketAddress, why not cache it on the EndPoint instead? And aren't some of the allocations you mention a side-effect of what used to be one implementation being split between multiple assemblies? If that split is resulting in a perf impact, we should consider moving types into the same assembly to avoid the overhead, rather than adding new APIs. Also, we separately want ValueTask/Memory-based APIs for UDP. Are you proposing that these be them? Would we not also have EndPoint-based APIs? This feels to me like it's breaking an abstraction that shouldn't be broken, and we should have exhausted every other avenue before doing so. |
Yes, I considered this option. We've been careful to make defensive copies when using
Agreed, this is one of my open questions. I don't really see a downside to merging at least System.Net.Primitives and System.Net.Sockets, but I don't have the background on why they were split in the first place. If you have any knowledge here it would be helpful. Doing so would help us cut out many of these allocations.
No, that is not part of this proposal. We should of course make both proposals consistent if it moves forward.
Agreed, this is one of my open questions. Hoping to see some collaboration on those potential avenues. |
Related:
|
User may not care about port but you alway have to specify one for sending. socket.SendTo(buffer, 0, buffer.Length, SocketFlags.None, remoteAddress); will not work unless you use Connect() before. And when you do, you do not need the remoteAddress/remoteEndPoint at all. Did I miss something? |
I think there are two historical things going on here. @stephentoub may have more context. (1) These APIs were originally designed a long time ago and at the time, minimizing allocations was not a major priority, unfortunately. We can and should fix these issues. In other words, there's some bad legacy here we can improve on, I think. |
It seems to me that we need to consider the send case and the receive case separately here. On the send side, there's no reason we need to do any copies at all. All we need to do is produce the native sockaddr bytes when we do the native call. Currently, that means we need to call IPEndPoint.Serialize which will allocate a SocketAddress etc. But if we just had something like IPEndPoint.Serialize(Span sockAddrBuffer) then I think we could avoid allocation in this path entirely without changing the top-level API. I'm glossing over details here so please correct me if this is more complicated than I'm describing On the receive side, we ultimately need to return the endpoint info somehow. In your proposal we're mutating an existing SocketAddress, so you are correct that this API does not allocate; but presumably every user of this API will convert the SocketAddress into an IPEndPoint, which will cause allocation at that point, right? I'm not sure what to do about this, but if the goal is truly zero allocation then we need to consider the common usage, not just the API itself. |
@wfurt |
@geoffkizer I don't think this needs to be a common use. Really you'd use the address to lookup a client's state in a dictionary -- you don't need an |
ok. that would make sense. I look at .Net API and I did not see port there. Would you envision adding Port property to it as well? |
Note it doesn't have an IP property either. My vision is to keep |
@geoffkizer yea, I thought of that. I think keeping the APIs symmetrical is worthwhile, though. Plus, there's some (probably ultimately inconsequential, but...) benefit to avoiding a |
A couple related questions that I don't quite understand: (1) Why does SocketAddress even exist today? Are there any APIs that take it? Is it just so a user can get the bytes for the native sockaddr structure? When would they even need to do this? |
Yeah, that's a good point.
I agree but I'm not quite sure the best way to do this. |
I think it is to allow custom implementations of |
@scalablecory Yeah that makes sense. |
Looks like it's to work within
|
Ok, but you already had to bind the socket to a LocalEndPoint, right? So why can't you figure that out from the LocalEndPoint? |
Good point. I can't account for that. Perhaps just an oversight in the design, or |
Has it been measured what performance impact these allocations have? I wonder if it matters with real network communication, even just from a CPU usage standpoint. |
@GSPP unfortunately we only have synthetic benchmarks to look at right now. Having good real-world benchmarks for Sockets is a long-term goal. Realistically, high-frequency protocols like QUIC, uTP, and latency-sensitive games (Unity engine) will see excessive gen-0 GC. |
@GSPP Here https://github.com/dotnet/corefx/issues/39317 I said that in our game server 40% of our allocations are from from UDP Socket. It is a disaster for us because we use GC.TryStartNoGCRegion and maintenance mode. |
@scalablecory I do think this is the right general approach. That is, have APIs that take/return a "sockaddr"-like argument. I think my main concern here is that the way SocketAddress works is a little weird. One example (that you pointed out) is it's not obvious that ReceiveFrom would fill in the provided SocketAddress. But I'm not sure what to do about this. |
One other thought. It seems like there's some low-hanging fruit here in terms of allocation that we could improve without any new actual API. This would help issues like #30196. Seems like we should try to make this better first before we introduce new API; that will provide benefit to existing customers. |
@geoffkizer thanks, I appreciate the thought you're putting into this.
Yea I'm not sure either. I wonder if we have any similar APIs we can lean on for inspiration... I'll have to go looking. @bartonjs can you lend us your API design muscle? If we have these two APIs, can you think of a clean way to help us distinguish between public int ReceiveFrom(Span<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress);
public int SendTo(ReadOnlySpan<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress);
I've created dotnet/corefx#41039 to merge Primitives and Sockets, which would allow us to get rid of most of these allocations without an API change (and without using the Lets decide on that one before moving forward with this one. |
Can you help me understand why merging the assemblies helps? |
Just looking at the SendTo code.... (1) We always make a "snapshot", i.e. copy, of the IPEndPoint -- which means a copy of the IPAddress as well. We only actually use the snapshot when _rightEndPoint is null, which I believe is uncommon. (I'm actually kind of unclear why we need _rightEndPoint at all here, but let's assume we do.) So at the very least we could just avoid the copy unless _rightEndPoint is null. (There's also some weirdness around dual mode that I'm going to ignore for now.) So I think for SendTo at least, we could get to 0 allocation without any API changes at all. ReceiveFrom is harder, and as the API works it will still require at least an allocation for the returned IPEndPoint and IPAddress. But it seems like we could do better there too. |
I gotta say, I really don't understand why we set _rightEndPoint in SendTo. The main purpose of _rightEndPoint seems to be to store the local endpoint. But in SendTo, we're setting it to the remote endpoint. This seems really weird. |
Thats what I meant, using a native wrapper is of course not an ideal solution, needing to compile the native dll for each platform separately is very annoying. But for the performance it is worth it at the moment. |
Just to be clear with my "quick win" comment, I'm referring to @antonfirsov's comment of work load available for changes in .NET 8. I'd rather even some improvements that we can get in this coming release than just waiting till some release in the future where all the changes required can be tackled at once. Whatever changes that can be made to .NET 8 given the workload might not be enough to get you to switch to using the managed implementation but it will benefit everybody already using it. When future releases contain more improvements, it may still get to a point where you actually can with minimal or no performance degradation. Really, all I'm talking about is time to get some changes in now and some more later rather than getting them all at once.
I am kinda curious how much better throughput/lower latency you're actually finding the native sockets over the .NET implementation. Is it like 2x faster or 10x faster? More? This is relating back to @antonfirsov's comment, if the managed implementation was 2x faster/half the allocations than it is currently, does that even remotely close the gap? |
As much as I also agree that getting any quick wins as soon as possible is a great idea, I think some people are discounting the time and effort it will take for some people to switch from their current set of wrappers back to the .NET APIs. If the custom wrapper was written with a completely different API style/approach than the .NET version, obviously it is not a drop in replacement, nor will it be a "simple refactor" . |
Well, I misunderstood you, and I apologize for that. |
https://github.com/enclave-networks/Enclave.FastPacket here is some repositories maybe little help |
I have PoC for the class Socket
{
public int ReceiveFrom(Span<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress);
public ValueTask<int> ReceiveFromAsync(Memory<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress, CancellationToken cancellationToken = default);
public int SendTo(ReadOnlySpan<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress);
public ValueTask<int> SendToAsync(ReadOnlyMemory<byte> buffer, SocketFlags socketFlags, SocketAddress socketAddress, CancellationToken cancellationToken = default);
} The From the conversation above: for the DNS server: The Mirroring is somewhat more complicated: https://github.com/MirrorNetworking/Mirror/blob/d226d577c21e557ae09086e0ddc1da63d2704e38/Assets/Mirror/Transports/KCP/kcp2k/highlevel/KcpServer.cs#L149 If the At the moment I don't have the originally proposed change for At least for projects discussed above this should be sufficient. If this is good enough for majority here I can drive it for API review and push it to 8.0 before the gates close. I opened separate #86872 to provide some more convenient methods around Lastly #86513 talks about cost of pinning. We had case when the |
@wfurt |
@wfurt That's amazing. This will elevate C# to a whole new class of use cases. |
什么时候可以完成这项0GC的工作啊???期待 [EDIT] |
Note that the label is |
👍 great. |
This was long journey but the end is near. #87397 added sync & async overloads with What missing is direct support for All this should be available ion RC1 build -> just barely made it in for 8.0. |
Thank you, it's appreciated. Should I use |
yes, it should have at least what the given socket can receive. Or more generically Socket socket = new Socket(SocketType.Dgram, ProtocolType.Udp)
SocketAddress socketAddress = new SocketAddress(socket.Address.Family); it will default to |
This proposal eliminates allocations for connectionless use of
Socket
. It augments theSocketAddress
class to allow reuse across operations, becoming a high-perf alternative toEndPoint
.Rationale and Usage
APIs which need to translate between
IPEndPoint
and nativesockaddr
structures are performing a large amount of defensive copying and layering workarounds.This affects UDP performance and contributes to excessive GC. A simple example is:
These two calls allocate 12 times:
IPEndPoint
IPAddress
SocketAddress
byte[]
...6x if IPv6See also: #30196
New usage has 0 allocations:
Proposed API
Details
SocketAddress
as a dictionary key to lookup client state, to avoid first converting toEndPoint
.SocketAddress
. This duty continues to be delegated toEndPoint
.SocketAddress
into anEndPoint
, they can already useEndPoint.Create
.SocketAddress
is currently duplicated in System.Net.Primitives and System.Net.Sockets to avoid exposing its internal buffer. This change will allow avoiding duplication.SocketAddress
until the I/O is finished.EndPoint
as we can take defensive copies before methods return.SendTo
andSendToAsync
IPv4/IPv6 with some special casing, so this API would primarily be to optimizeReceiveFrom
variants as well as (less important) allowing non-IPv4/IPv6 protocols to benefit. Still, if we were to add an API forReceiveFrom
we would probably want an API onSendTo
for symmetry.Open Questions
SocketAddress
class in multiple assemblies to avoid exposing its buffer, and have a step to marshal (byte-by-byte) between the two implementations.EndPoint
, it's a nice abstraction that we wanted here despite performance implications.socketAddress
is written to byReceiveFrom
. Is there a better way we can indicate this?SocketAddress
exist purely because System.Net.Sockets needs access to internals in System.Net.Primitives. Any thoughts on how to avoid exposing these "pubternal" bits?ReceiveFrom
without making any API changes. It's not a perfect solution but might be good enough.Related Issues
There are two additional issues to update our APIs with
ValueTask
/Span
/Memory
that this will need to be consistent with:Socket
class add Span overloads for Socket datagram functions #938UdpClient
class UdpClient - add Span support #864The text was updated successfully, but these errors were encountered: