-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[6.0] Port "Change profilers to use thread local evacuation counters (#59741)" to 6.0 #60116
Conversation
* Change profilers to use thread local evacuation counters Change to prefix increment * get rid of lambdas * Fix jit inlining, fix R2R too * Remove VolatilePtr<> from helpers * Get rid of additionalData argument
Tagging subscribers to this area: @tommcdon Issue DetailsDescriptionWhen running on 6.0 RC1 the Bing team detected a regression in P95 latency. After investigation it was tracked down to being caused by the notification profiler feature for three reasons
TestingCI tests, diagnostics tests, local profiler stress tests Confirmed by Bing that it solves the P95 regression RiskThe risk of this change is that it could affect performance or introduce a subtle profiler bug. For performance we validated that Bing performance is as we expect For profiler tests we ran the suite of profiler tests, as well as one off stress testing of attaching and detaching multiple profilers in a loop for hours
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. Please make sure to get several code reviews, we should take for consideration in .NET 6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey David this looks good! I spent a while scrutinizing and didn't find anything that is likely to be significant, but a couple things inline to check out.
|
||
FORCEINLINE BOOL ProfControlBlock::RequiresGenericsContextForEnterLeave() | ||
{ | ||
return AnyProfilerPassesCondition(&RequiresGenericsContextForEnterLeaveHelper); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that you should change it, but do notification profilers support ELT? I didn't recall that they did so checking all of them seemed unnecessary.
|
||
return AnyProfilerPassesCondition([](ProfilerInfo *pProfilerInfo) { return pProfilerInfo->curProfStatus.Get() >= kProfStatusActive; }); | ||
return (&g_profControlBlock)->mainProfilerInfo.pProfInterface.Load() != NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't spot a bug here, but the change in semantics makes the CHECK_PROFILER_STATUS() macro very misleading. That macro relies on this call to avoid delivering notifications to profilers that are not in the active state. Presumably that filtering no longer matters because the filtering occured earlier in DoProfilerCallback, but it is confusing to see the macro not doing correctly what it claims to do.
I reviewed the other sites that called CORProfilerPresent() to see if any of them would be negatively affected and didn't spot anything else that was clearly suspicious, but it would be good for you to scan for it too if you hadn't already. I'm fine not changing the CHECK_PROFILER_STATUS() macro for .NET 6, but it would be good to improve the situation in main.
} | ||
|
||
inline BOOL CORProfilerTrackConditionalWeakTableElements() | ||
FORCEINLINE BOOL CORProfilerTrackConditionalWeakTableElements() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test below for IsCallback5Supported iterates all profilers so this test is probably much more expensive than it used to be. If it is used in a very hot path there may be a perf issue lurking.
//--------------------------------------------------------------- | ||
// Why volatile? | ||
// See code:ProfilingAPIUtility::InitializeProfiling#LoadUnloadCallbackSynchronization. | ||
Volatile<DWORD> m_dwProfilerEvacuationCounters[MAX_NOTIFICATION_PROFILERS + 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you made the right choice to translate the scheme directly from global to per-thread for now. If future performance analysis shows that we need improvements in notification callback perf or per-thread memory usage there are some techniques we could apply to optimize this.
Description
When running on 6.0 RC1 a customer detected a regression in P95 latency. After investigation it was tracked down to being caused by the notification profiler feature for three reasons
This fix addresses all three of these issues
Testing
CI tests, diagnostics tests, local profiler stress tests
Confirmed by the customer that it solves the P95 regression
Risk
The risk of this change is that it could affect performance or introduce a subtle profiler bug.
For performance we validated that customer app performance is as we expect
For profiler tests we ran the suite of profiler tests, as well as one off stress testing of attaching and detaching multiple profilers in a loop for hours