Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single epoll thread per 28 cores #35800

Merged
merged 15 commits into from
May 7, 2020
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1986,7 +1986,8 @@ public SocketError SendFileAsync(SafeFileHandle fileHandle, long offset, long co
// be scheduled instead. It's not functionally incorrect to schedule the release of a synchronous operation, just it may
// lead to thread pool starvation issues if the synchronous operations are blocking thread pool threads (typically not
// advised) and more threads are not immediately available to run work items that would release those operations.
public unsafe Interop.Sys.SocketEvents HandleSyncEventsSpeculatively(Interop.Sys.SocketEvents events)
[MethodImpl(MethodImplOptions.AggressiveInlining)]
stephentoub marked this conversation as resolved.
Show resolved Hide resolved
public Interop.Sys.SocketEvents HandleSyncEventsSpeculatively(Interop.Sys.SocketEvents events)
{
if ((events & Interop.Sys.SocketEvents.Error) != 0)
{
Expand Down Expand Up @@ -2067,4 +2068,16 @@ public static void OutputTrace(string s)

public static string IdOf(object o) => o == null ? "(null)" : $"{o.GetType().Name}#{o.GetHashCode():X2}";
}

// struct wrapper is used in order to improve the performance of the epoll thread hot path by up to 3% of some TechEmpower benchmarks
// the goal is to have a dedicated generic instantiation and using:
// System.Collections.Dictionary`2[System.IntPtr,System.Net.Sockets.SocketAsyncContextWrapper]::TryGetValueInternal(!0,int32,!1&)
// instead of:
// System.Collections.Dictionary`2[System.IntPtr,System.__Canon]::TryGetValueInternal(!0,int32,!1&)
internal readonly struct SocketAsyncContextWrapper
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
{
public SocketAsyncContextWrapper(SocketAsyncContext context) => Context = context;

internal SocketAsyncContext Context { get; }
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,11 @@
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;

namespace System.Net.Sockets
{
Expand Down Expand Up @@ -56,16 +54,21 @@ public bool TryRegister(SafeSocketHandle socket, out Interop.Error error)

private static readonly object s_lock = new object();

// In debug builds, force there to be 2 engines. In release builds, use half the number of processors when
// there are at least 6. The lower bound is to avoid using multiple engines on systems which aren't servers.
#pragma warning disable CA1802 // const works for debug, but needs to be static readonly for release
private static readonly int s_engineCount =
private static readonly int s_engineCount = GetEnginesCount();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should this be s_maxEngineCount? We won't always have this many, but we may grow to this many based on the number of concurrent sockets, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub I've suggested to remove that logic as part of this PR (#35800 (comment)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmds You are most probably right. The only use case for keeping it is a machine with many cores and very few connections. Which should be uncommon.

Would you prefer me to remove it now or would you like to do this in your upcoming PR that is going to enable the "inlining"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsitnik remove it here, it is unrelated to inlining.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmds I am going to merge it as it is right now as I would really love to see the update numbers. I am going to send a PR with MinHandles logic removal today or tomorrow.


private static int GetEnginesCount()
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
{
if (uint.TryParse(Environment.GetEnvironmentVariable("SYSTEM_NET_SOCKETS_ENGINE_COUNT"), out uint count))
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
{
return (int)count;
}

#if DEBUG
2;
return 2; // use two engines to make sure that this code path is covered with tests
#else
Environment.ProcessorCount >= 6 ? Environment.ProcessorCount / 2 : 1;
return 1; // having a single epoll thread is optimal
#endif
#pragma warning restore CA1802
}

//
// The current engines. We replace an engine when it runs out of "handle" values.
Expand Down Expand Up @@ -127,9 +130,9 @@ public bool TryRegister(SafeSocketHandle socket, out Interop.Error error)
private IntPtr _outstandingHandles;

//
// Maps handle values to SocketAsyncContext instances.
// Maps handle values to SocketAsyncContext instances. Must be used under a lock!
//
private readonly ConcurrentDictionary<IntPtr, SocketAsyncContext> _handleToContextMap = new ConcurrentDictionary<IntPtr, SocketAsyncContext>();
private readonly Dictionary<IntPtr, SocketAsyncContextWrapper> _handleToContextMap = new Dictionary<IntPtr, SocketAsyncContextWrapper>();

//
// Queue of events generated by EventLoop() that would be processed by the thread pool
Expand Down Expand Up @@ -208,7 +211,11 @@ private IntPtr AllocateHandle(SocketAsyncContext context)
Debug.Assert(!IsFull, "Expected !IsFull");

IntPtr handle = _nextHandle;
_handleToContextMap.TryAdd(handle, context);
lock (_handleToContextMap)
{
Debug.Assert(handle != ShutdownHandle, "ShutdownHandle must not be added to the dictionary");
_handleToContextMap.TryAdd(handle, new SocketAsyncContextWrapper(context));
}

_nextHandle = IntPtr.Add(_nextHandle, 1);
_outstandingHandles = IntPtr.Add(_outstandingHandles, 1);
Expand All @@ -225,7 +232,14 @@ private void FreeHandle(IntPtr handle)

lock (s_lock)
{
if (_handleToContextMap.TryRemove(handle, out _))
bool removed = false;

lock (_handleToContextMap)
{
removed = _handleToContextMap.Remove(handle, out _);
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
}

if (removed)
{
_outstandingHandles = IntPtr.Subtract(_outstandingHandles, 1);
Debug.Assert(_outstandingHandles.ToInt64() >= 0, $"Unexpected _outstandingHandles: {_outstandingHandles}");
Expand Down Expand Up @@ -318,8 +332,10 @@ private void EventLoop()
{
bool shutdown = false;
Interop.Sys.SocketEvent* buffer = _buffer;
ConcurrentDictionary<IntPtr, SocketAsyncContext> handleToContextMap = _handleToContextMap;
Dictionary<IntPtr, SocketAsyncContextWrapper> handleToContextMap = _handleToContextMap;
ConcurrentQueue<SocketIOEvent> eventQueue = _eventQueue;
IntPtr shutdownHandle = ShutdownHandle;
SocketAsyncContext? context = null;
while (!shutdown)
{
int numEvents = EventBufferCount;
Expand All @@ -333,21 +349,17 @@ private void EventLoop()
Debug.Assert(numEvents > 0, $"Unexpected numEvents: {numEvents}");

bool enqueuedEvent = false;
for (int i = 0; i < numEvents; i++)
lock (_handleToContextMap)
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
{
IntPtr handle = buffer[i].Data;
if (handle == ShutdownHandle)
{
shutdown = true;
}
else
foreach (var socketEvent in new ReadOnlySpan<Interop.Sys.SocketEvent>(buffer, numEvents))
{
Debug.Assert(handle.ToInt64() < MaxHandles.ToInt64(), $"Unexpected values: handle={handle}, MaxHandles={MaxHandles}");
handleToContextMap.TryGetValue(handle, out SocketAsyncContext? context);
if (context != null)
IntPtr handle = socketEvent.Data;

if (handleToContextMap.TryGetValue(handle, out SocketAsyncContextWrapper contextWrapper) && (context = contextWrapper.Context) != null)
{
Interop.Sys.SocketEvents events = buffer[i].Events;
events = context.HandleSyncEventsSpeculatively(events);
Debug.Assert(handle.ToInt64() < MaxHandles.ToInt64(), $"Unexpected values: handle={handle}, MaxHandles={MaxHandles}");

Interop.Sys.SocketEvents events = context.HandleSyncEventsSpeculatively(socketEvent.Events);
if (events != Interop.Sys.SocketEvents.None)
{
var ev = new SocketIOEvent(context, events);
Expand All @@ -365,6 +377,10 @@ private void EventLoop()
// such code may keep the stack location live for longer than necessary
context = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure in the test NonDisposedSocket_SafeHandlesCollected may be because of the extra reference to the context from the wrapper, might need to also do contextWrapper = default to make the test work in debug builds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That test had been failing since Kount's PR was merged: #35846. The JIT will often make tmp copies such that nulling out is often insufficient for debug builds. Hence my wondering why we were doing so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rolling builds the seems to be failing periodically in release builds, seems like it might be something a bit different but needs to be looked into.

I'm also not too happy about nulling out the locals, seems like that's the only way the test would have passed in debug builds before my change. On one hand the test seemed fair in expecting that a non-disposed socket gets finalized. On another hand that may not happen in debuggable code at least, maybe a better approach would be to make sure it can't happen with quick JIT and enable the test only for release builds or something like that, not sure.

}
else if (handle == shutdownHandle)
{
shutdown = true;
}
}
}

Expand Down