[PERF] Akka: move all dispatchers off of the dedicated thread pool #7117

Merged: 1 commit into akkadotnet:dev from remove-DTP-dispatchers on Mar 11, 2024

Conversation

@Aaronontheweb (Member) commented on Mar 11, 2024

Changes

In the aftermath of .NET 6, I don't think ForkJoinExecutors and the DedicatedThreadPool are as necessary as they once were. Automatic thread injection when blocking is detected should resolve many of the blocking I/O issues we used to worry about.

On the flip side, this should radically reduce the idle CPU consumption of Akka.NET and make the platform more AOT-friendly, among other things. Will provide benchmarks momentarily.

Should resolve #5400
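For anyone who still has genuinely blocking workloads, this shouldn't remove the escape hatch of giving a specific dispatcher its own threads. As a minimal sketch, assuming the fork-join-executor / dedicated-thread-pool settings remain available after this change, and with an illustrative dispatcher name and thread count (check reference.conf for the exact keys):

# Illustrative only: an app-defined dispatcher that keeps a dedicated pool for blocking work
my-blocking-dispatcher {
  type = Dispatcher
  executor = fork-join-executor      # still backed by DedicatedThreadPool
  dedicated-thread-pool {
    thread-count = 4                 # fixed number of dedicated threads
  }
}

An actor would then opt in explicitly, e.g. via Props' WithDispatcher("my-blocking-dispatcher"), while everything else runs on the shared .NET ThreadPool.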

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Latest dev Benchmarks

Include data from the relevant benchmark prior to this change here.

This PR's Benchmarks

Include data from after this change here.

@Aaronontheweb (Member, Author) commented:

Working on benchmarks now

@Aaronontheweb (Member, Author) commented:

RemotePingPong

This PR

OSVersion:                         Microsoft Windows NT 10.0.22631.0
ProcessorCount:                    16
ClockSpeed:                        0 MHZ
Actor Count:                       32
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True
Thread count:                      46

Num clients, Total [msg], Msgs/sec, Total [ms], Start Threads, End Threads
         1,  200000,    130040,    1538.65,            46,              73
         5, 1000000,    519211,    1926.39,            81,              95
        10, 2000000,    666445,    3001.28,           103,             107
        15, 3000000,    708049,    4237.05,           115,             115
        20, 4000000,    718908,    5564.23,           123,             123
        25, 5000000,    674673,    7411.83,           131,             107
        30, 6000000,    667557,    8988.76,           115,              97
Done..

dev

OSVersion:                         Microsoft Windows NT 10.0.22631.0
ProcessorCount:                    16
ClockSpeed:                        0 MHZ
Actor Count:                       32
Messages sent/received per client: 200000  (2e5)
Is Server GC:                      True
Thread count:                      107

Num clients, Total [msg], Msgs/sec, Total [ms], Start Threads, End Threads
         1,  200000,    125945,    1588.96,           107,             129
         5, 1000000,    533334,    1875.84,           137,             157
        10, 2000000,    668897,    2990.89,           165,             165
        15, 3000000,    665042,    4511.90,           173,             173
        20, 4000000,    662911,    6034.16,           181,             173
        25, 5000000,    685214,    7297.86,           181,             164
        30, 6000000,    685715,    8750.54,           172,             147
Done..

Pretty much what I expected, and consistent with what we've seen in these comparisons before: much lower thread counts and higher peak performance, but roughly the same median performance.

@Aaronontheweb (Member, Author) commented:

PingPong

Worth noting that none of the actors in this benchmark ever executed anything on a dedicated thread pool; the only thing that can be measured here is whether the idle DTPs in the background were causing any CPU steal.

This PR

Warming up...                                                                                      
OSVersion:              Microsoft Windows NT 10.0.22631.0                                          
ProcessorCount:         16                                                                         
ClockSpeed:             0 MHZ                                                                      
Actor Count:            32                                                                         
Messages sent/received: 30000000  (3e7)                                                            
Is Server GC:           True                                                                       
Thread count:           35                                                                         
                                                                                                   
ActorBase    first start time: 12.55 ms                                                            
ReceiveActor first start time: 26.61 ms                                                            
                                                                                                   
            ActorBase                          ReceiveActor                                        
Throughput, Msgs/sec, Start [ms], Total [ms],  Msgs/sec, Start [ms], Total [ms]                    
         1, 38510000,      90.05,     869.30,  47021000,     159.50,     797.57                    
         5, 101351000,     160.16,     456.54,  89820000,      77.78,     411.97                   
        10, 98360000,      71.74,     377.40,  106382000,     141.85,     424.62                   
        15, 88495000,      28.48,     367.50,  99667000,      50.46,     351.48                    
        20, 98039000,      54.93,     361.54,  100334000,      53.72,     353.22                   
        30, 132158000,      26.19,     253.36,  112781000,      70.54,     337.05                  
        40, 134529000,      60.72,     284.55,  114068000,      23.92,     287.83                  
        50, 80000000,      91.68,     467.57,  114068000,      75.99,     339.40                   
        60, 114942000,      61.31,     323.29,  124481000,      20.98,     262.72                  
        70, 130434000,      52.85,     283.49,  114942000,      48.85,     309.97                  
        80, 86206000,      80.88,     429.07,  115384000,      28.00,     288.25                   
        90, 132743000,       1.94,     228.63,  118110000,       2.05,     256.29                  
       100, 135135000,      80.23,     302.33,  122950000,      70.30,     315.12                  
       200, 104529000,      34.60,     322.51,  106007000,       8.35,     292.01                  
       300, 99337000,     118.24,     420.45,  102389000,      22.10,     315.47                   
       400, 105633000,       2.41,     287.20,  92592000,      44.27,     368.80                   
       500, 128755000,       2.00,     235.49,  83333000,      80.80,     441.16                   
       600, 99667000,      93.94,     395.91,  79365000,       2.55,     381.00                    
       700, 104529000,      44.01,     331.52,  103806000,       1.93,     291.30                  
       800, 101010000,      89.55,     387.13,  117187000,       2.11,     258.34                  
       900, 120000000,       5.66,     255.67,  100334000,      85.41,     385.08                  
Done..                                                                                             

dev

Warming up...                                                                           
OSVersion:              Microsoft Windows NT 10.0.22631.0                               
ProcessorCount:         16                                                              
ClockSpeed:             0 MHZ                                                           
Actor Count:            32                                                              
Messages sent/received: 30000000  (3e7)                                                 
Is Server GC:           True                                                            
Thread count:           31                                                              
                                                                                        
ActorBase    first start time:  9.29 ms                                                 
ReceiveActor first start time: 24.67 ms                                                 
                                                                                        
            ActorBase                          ReceiveActor                             
Throughput, Msgs/sec, Start [ms], Total [ms],  Msgs/sec, Start [ms], Total [ms]         
         1, 25531000,     141.34,    1316.77,  49342000,     102.51,     711.46         
         5, 99009000,      94.37,     398.19,  78125000,       5.06,     389.93         
        10, 108303000,      65.01,     342.03,  110701000,      92.23,     363.76       
        15, 100000000,      67.17,     367.81,  108695000,      29.60,     306.08       
        20, 98039000,      50.93,     357.68,  102389000,       4.37,     298.07        
        30, 129310000,       3.51,     236.37,  131004000,       3.52,     232.82       
        40, 117187000,      11.35,     267.75,  115830000,      69.51,     329.03       
        50, 124481000,      20.03,     261.35,  126582000,       3.57,     241.46       
        60, 135135000,       4.37,     226.71,  128205000,       4.14,     239.01       
        70, 111111000,      10.22,     280.98,  115830000,      34.24,     294.06       
        80, 103448000,       3.44,     293.80,  98684000,       3.65,     307.79        
        90, 137614000,      16.91,     235.80,  105633000,       3.21,     288.17       
       100, 92307000,       6.23,     331.41,  127659000,       3.40,     239.02        
       200, 121951000,      52.24,     299.00,  126582000,       3.35,     241.19       
       300, 107526000,     107.12,     386.21,  120481000,      13.29,     262.91       
       400, 121457000,       3.69,     250.80,  92307000,      48.23,     374.13        
       500, 112359000,      45.80,     313.17,  111111000,      40.80,     311.52       
       600, 129870000,       3.29,     234.55,  115384000,       3.54,     263.69       
       700, 95541000,       2.90,     317.71,  98684000,      33.15,     337.28         
       800, 132743000,      35.58,     261.80,  103448000,      52.95,     342.99       
       900, 121457000,       3.14,     250.96,  78740000,       2.41,     383.67        
Done..                                                                                  

Pretty comparable performance - CPU steal not much of a factor when there's real work going on, by the looks of it.

@Aaronontheweb (Member, Author) reviewed changes and left the following inline comments:

- type = PinnedDispatcher
- executor = "fork-join-executor"
+ type = Dispatcher
+ executor = "default-executor"

Moves Akka.Persistence back over to the normal dispatcher when the plugin doesn't use a custom one (there are none that do, IIRC)
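Plugins that do need isolation still have a hook for it via the usual plugin-dispatcher setting. A hedged sketch, with a made-up plugin id and dispatcher path:

# Illustrative only: "my-journal" and the dispatcher path are hypothetical
akka.persistence.journal.my-journal {
  plugin-dispatcher = "akka.persistence.dispatchers.my-plugin-dispatcher"
}
akka.persistence.dispatchers.my-plugin-dispatcher {
  type = Dispatcher
  executor = default-executor        # or a dedicated executor if truly required
  throughput = 100
}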

@@ -570,7 +570,7 @@ akka {
### Default dispatcher for the remoting subsystem

default-remote-dispatcher {
- executor = fork-join-executor
+ executor = default-executor

Moves Akka.Remote off of the dedicated thread pool
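If a particular deployment turns out to regress without isolated remoting threads, the old behavior can presumably be restored with a user-level override along these lines (full path assumed to be akka.remote.default-remote-dispatcher; thread count illustrative, and the dedicated-thread-pool keys should be double-checked against reference.conf):

# Illustrative override: put remoting back on its own dedicated threads
akka.remote.default-remote-dispatcher {
  executor = fork-join-executor
  dedicated-thread-pool {
    thread-count = 4
  }
}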

@@ -389,7 +389,7 @@ akka {
# default dispatcher)
internal-dispatcher {
type = "Dispatcher"
executor = "fork-join-executor"
executor = "default-executor"

Moves the internal dispatcher off its own dedicated thread pool

@Aaronontheweb marked this pull request as ready for review on March 11, 2024 at 19:11
@Arkatufus (Contributor) left a comment:

LGTM

@Aaronontheweb merged commit 251622d into akkadotnet:dev on Mar 11, 2024
8 of 12 checks passed
@Aaronontheweb deleted the remove-DTP-dispatchers branch on March 11, 2024 at 19:55
Development

Successfully merging this pull request may close these issues.

High CPU Load for idle clusters