Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance regression in SSL handshake #66077

Merged
merged 2 commits into from
Mar 7, 2022

Conversation

rzikm
Copy link
Member

@rzikm rzikm commented Mar 2, 2022

Fixes Issue #66012

The perf regression seems to be due to SECPKG_ATTR_REMOTE_CERT_CHAIN being more expensive to query than SECPKG_ATTR_REMOTE_CERT_CONTEXT used until #65134. This PR swaps the order of their querying so that CERT_CHAIN is queried only when we really need the certificate before TLS handshake is completed (i.e. during LocalClientCertificateSelectionCallback).

local run of the affected benchmark:

 BenchmarkDotNet=v0.13.1.1702-nightly, OS=Windows 11 (10.0.22000.493/21H2)                                                                                                                
 Intel Core i9-10900K CPU 3.70GHz, 1 CPU, 20 logical and 10 physical cores                                                                                                                
 .NET SDK=7.0.100-preview.3.22151.18                                                                                                                                                      
   [Host]     : .NET 7.0.0 (7.0.22.11609), X64 RyuJIT                                                                                                                                     
   Job-CGHVCJ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
   Job-GSIUQU : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
   Job-RHRBBQ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
                                                                                                                                                                                          
 PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable,-bl:benchmarkdotnet.binlog  IterationTime=250.0000 ms                                                
 MaxIterationCount=20  MinIterationCount=15  WarmupCount=1                                                                                                                                
                                                                                                                                                                                          
 |                           Method |        Job |             Toolchain |     Mean |     Error |    StdDev |   Median |      Min |      Max | Ratio | RatioSD | Allocated | Alloc Ratio |
 |--------------------------------- |----------- |---------------------- |---------:|----------:|----------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
 | DefaultHandshakeContextIPv6Async | Job-CGHVCJ |    \7.0.0\corerun.exe | 1.677 ms | 0.0928 ms | 0.1069 ms | 1.632 ms | 1.523 ms | 1.846 ms |  0.99 |    0.10 |   5.59 KB |        1.00 |
 | DefaultHandshakeContextIPv6Async | Job-GSIUQU |     \main\corerun.exe | 1.736 ms | 0.0469 ms | 0.0521 ms | 1.726 ms | 1.656 ms | 1.850 ms |  1.02 |    0.07 |   5.58 KB |        0.99 |
 | DefaultHandshakeContextIPv6Async | Job-RHRBBQ | \reverted\corerun.exe | 1.709 ms | 0.0915 ms | 0.1054 ms | 1.679 ms | 1.545 ms | 1.856 ms |  1.00 |    0.00 |   5.62 KB |        1.00 |

The perf diff seems to be small on my machine, but maybe it is more pronounced on Win10 where the product benchmarks run.

@ghost
Copy link

ghost commented Mar 2, 2022

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes Issue #66012

The perf regression seems to be due to SECPKG_ATTR_REMOTE_CERT_CHAIN being more expensive to query than SECPKG_ATTR_REMOTE_CERT_CONTEXT used until #65134. This PR swaps the order of their querying so that CERT_CHAIN is queried only when we really need the certificate before TLS handshake is completed (i.e. during LocalClientCertificateSelectionCallback).

 BenchmarkDotNet=v0.13.1.1702-nightly, OS=Windows 11 (10.0.22000.493/21H2)                                                                                                                
 Intel Core i9-10900K CPU 3.70GHz, 1 CPU, 20 logical and 10 physical cores                                                                                                                
 .NET SDK=7.0.100-preview.3.22151.18                                                                                                                                                      
   [Host]     : .NET 7.0.0 (7.0.22.11609), X64 RyuJIT                                                                                                                                     
   Job-CGHVCJ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
   Job-GSIUQU : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
   Job-RHRBBQ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
                                                                                                                                                                                          
 PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable,-bl:benchmarkdotnet.binlog  IterationTime=250.0000 ms                                                
 MaxIterationCount=20  MinIterationCount=15  WarmupCount=1                                                                                                                                
                                                                                                                                                                                          
 |                           Method |        Job |             Toolchain |     Mean |     Error |    StdDev |   Median |      Min |      Max | Ratio | RatioSD | Allocated | Alloc Ratio |
 |--------------------------------- |----------- |---------------------- |---------:|----------:|----------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
 | DefaultHandshakeContextIPv6Async | Job-CGHVCJ |    \7.0.0\corerun.exe | 1.677 ms | 0.0928 ms | 0.1069 ms | 1.632 ms | 1.523 ms | 1.846 ms |  0.99 |    0.10 |   5.59 KB |        1.00 |
 | DefaultHandshakeContextIPv6Async | Job-GSIUQU |     \main\corerun.exe | 1.736 ms | 0.0469 ms | 0.0521 ms | 1.726 ms | 1.656 ms | 1.850 ms |  1.02 |    0.07 |   5.58 KB |        0.99 |
 | DefaultHandshakeContextIPv6Async | Job-RHRBBQ | \reverted\corerun.exe | 1.709 ms | 0.0915 ms | 0.1054 ms | 1.679 ms | 1.545 ms | 1.856 ms |  1.00 |    0.00 |   5.62 KB |        1.00 |
Author: rzikm
Assignees: rzikm
Labels:

area-System.Net.Security

Milestone: -

{
SSPIWrapper.QueryContextAttributes_SECPKG_ATTR_REMOTE_CERT_CONTEXT(GlobalSSPI.SSPISecureChannel, securityContext, out remoteContext);
// The query can fail if TLS handshake has not completed yet. In that case we fallback to querying
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we expect this to be rare?

Copy link
Member Author

@rzikm rzikm Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the only case when we need the remote cert before handshake completes is during the second call of the LocalClientCertificateSelectionCallback, which provides acceptable issuers and the server certificate as arguments. So the conditions for that happening should be:

  • Server requires client certificate
  • Client uses LocalClientCertificateSelectionCallback which during the first call returns null or a cert which does not match trusted issuers required by server.

I don't know how common that usage is, but since not many people complained that until #65134 the server cert was always null on Windows then I suppose it is indeed rare.

@wfurt
Copy link
Member

wfurt commented Mar 2, 2022

Can know upfront if we are in handshake or not and do only one of the calls as appropriate? And possibly know if the call is supported at all via static bool ? This fall-back logic still little bit worries me.

@rzikm
Copy link
Member Author

rzikm commented Mar 3, 2022

Can know upfront if we are in handshake or not and do only one of the calls as appropriate?

If I see correctly, the CertificateValidationPal.GetRemoteCertificate has 2 overloads, one called during TLS handshake and one after the handshake completes, so we can differentiate between those two situations. I will try it out.

And possibly know if the call is supported at all via static bool ?

I see a few options:

  • set a flag once we encounter first failure, i.e. check for SEC_E_UNSUPPORTED_FUNCTION inside SSPIWrapper and short-circuit all following calls to SSPIWrapper.QueryContextAttributes_SECPKG_ATTR_REMOTE_CERT_CHAIN
  • put OS version check in CertificateValidationPal.GetRemoteCertificate

I am not sure I like either of them. Since we can differentiate between in-handshake and after-handshake calls, I think we should just leave it be. Win8 and newer support the function (I am not sure about Win7, the test that requires it is disabled there) and besides, until now we already called QueryContextAttributes at least once, so there will be no additional perf hit on Win7 since the call will fail as before.

@rzikm
Copy link
Member Author

rzikm commented Mar 3, 2022

Updated benchmarks look promising

// * Summary *                                                                                                                                                                           
                                                                                                                                                                                         
BenchmarkDotNet=v0.13.1.1702-nightly, OS=Windows 11 (10.0.22000.493/21H2)                                                                                                                
Intel Core i9-10900K CPU 3.70GHz, 1 CPU, 20 logical and 10 physical cores                                                                                                                
.NET SDK=7.0.100-preview.3.22152.16                                                                                                                                                      
  [Host]     : .NET 7.0.0 (7.0.22.11609), X64 RyuJIT                                                                                                                                     
  Job-QSWYNC : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
  Job-YFPROJ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
  Job-IHEQPI : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT                                                                                                                                   
                                                                                                                                                                                         
PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable,-bl:benchmarkdotnet.binlog  IterationTime=250.0000 ms                                                
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1                                                                                                                                
                                                                                                                                                                                         
|                           Method |        Job |             Toolchain |     Mean |     Error |    StdDev |   Median |      Min |      Max | Ratio | RatioSD | Allocated | Alloc Ratio |
|--------------------------------- |----------- |---------------------- |---------:|----------:|----------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
| DefaultHandshakeContextIPv6Async | Job-QSWYNC |    \7.0.0\corerun.exe | 1.439 ms | 0.0178 ms | 0.0167 ms | 1.438 ms | 1.411 ms | 1.475 ms |  1.00 |    0.02 |   5.58 KB |        1.00 |
| DefaultHandshakeContextIPv6Async | Job-YFPROJ |     \main\corerun.exe | 1.489 ms | 0.0232 ms | 0.0217 ms | 1.488 ms | 1.455 ms | 1.519 ms |  1.03 |    0.03 |   5.58 KB |        1.00 |
| DefaultHandshakeContextIPv6Async | Job-IHEQPI | \reverted\corerun.exe | 1.446 ms | 0.0254 ms | 0.0237 ms | 1.444 ms | 1.416 ms | 1.502 ms |  1.00 |    0.00 |    5.6 KB |        1.00 |

@rzikm
Copy link
Member Author

rzikm commented Mar 3, 2022

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@wfurt wfurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me if it fixes the regression. I think it is OK if we don't get certificate in callback on Windows 7. It was like this for a while and it is unlikely that we get new major customers there.

@rzikm
Copy link
Member Author

rzikm commented Mar 7, 2022

CI failures are #66143, no relevant test failures to this change

@rzikm rzikm merged commit 7698a9a into dotnet:main Mar 7, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Apr 6, 2022
@karelz karelz added this to the 7.0.0 milestone Apr 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants