Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable optimizations for PopCount #99796

Merged
merged 2 commits into from
Mar 15, 2024
Merged

disable optimizations for PopCount #99796

merged 2 commits into from
Mar 15, 2024

Conversation

mangod9
Copy link
Member

@mangod9 mangod9 commented Mar 15, 2024

avoid using an optimization which might fail on non-SSE4 cpus. Latest VC toolset seems to be optimizing the sequence to popcnt instruction which is problematic.

avoid using an optimization which might fail on non-SSE4 cpus.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 15, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@@ -3436,6 +3436,11 @@ uint32_t BitOperations::Log2(uint64_t value)
// Return Value:
// The population count (number of bits set) of value
//
#if defined(_MSC_VER)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this disables inlining of these methods too, so we might need to disable the specific optimization if this is perf critical.

@mangod9
Copy link
Member Author

mangod9 commented Mar 15, 2024

Looks like it's already fixed in MSVC?

Hmm, interesting. Guess we can pickup the new toolset then -- not sure about the feasibility?

#if defined(_MSC_VER)
// Disable optimizations for PopCount to avoid the compiler from generating intrinsics
// not supported on all platforms.
#pragma optimize("", off)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we root caused and found a minimal repro for this?

Given popcnt is not a x64 baseline instruction, it should not be getting generated without explicit opt in (meaning either we have a cmake/config issue on our end or MSVC has a bug)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's an MSVC issue, we should ensure an appropriate issue is filed against the compiler to track them fixing this.

Alternatively, if Windows has changed requirements at all, we should ensure it's documented and appropriately take advantage of these requirements (I vaguely remember seeing some change where Windows 11 may require SSE4.2 instructions)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, if Windows has changed requirements at all, we should ensure it's documented and appropriately take advantage of these requirements (I vaguely remember seeing some change where Windows 11 may require SSE4.2 instructions)

IIRC next update of Windows 11 requires popcnt, however I'd say that dotnet shouldn't require it as it still supports 10 and Server 2012.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to build a repro. I have verified within the debugger that the compiler does generate popcnt, and as Egor points out above looks like a fix has been made recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has been fixed already, you should be able to put the workaround under _MSC_VER or _MSC_FULL_VER ifdef to make self-documenting and self-healing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah checking if we can just pickup the tool update.

IIRC next update of Windows 11 requires popcnt, however I'd say that dotnet shouldn't require it as it still supports 10 and Server 2012.

Correct Win11 24H2 is expected to require it, but yeah that is a separate discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has been fixed already, you should be able to put the workaround under _MSC_VER or _MSC_FULL_VER ifdef to make self-documenting and self-healing.

makes sense, but we will need to check the exact versions which had the issue (since it affected both 17.8 and 17.9).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the links I shared:

has been fixed in 17.8.6 and 17.9.0 Preview 5

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is a temporary change, not worth special casing versions in case we get the numbers wrong. The toolset update is going to be in the next few weeks and I can create a revert PR.

Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved. we should consider only doing this for 8 servicing, as we will take a toolset that has the fix shortly

@mangod9 mangod9 merged commit aee4957 into dotnet:main Mar 15, 2024
129 checks passed
@mangod9
Copy link
Member Author

mangod9 commented Mar 15, 2024

/backport to release/8.0-staging

Copy link
Contributor

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/8301063551

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

Hello, this bug is causing a bit of a headache for users of my software. It appears Visual Studio v17.9.5 includes an out-of-band fix for this, but updating the DLL on the end-user system does not resolve the crash. I'd like to help my users by providing some type of temporary fix while we wait for patch Tuesday. Is there any way to do this?

@mangod9
Copy link
Member Author

mangod9 commented Mar 29, 2024

@EatonZ, can you elaborate on whether .NET 8.0.3 is causing issues for you or Visual Studio?

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

@mangod9 .NET 8.0.3 runtime

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

Some of my users are getting the exit code 0x80131506 crash, which this PR seems to address. I would like to provide them with some type of fix, because right now they cannot use my software at all. The only workaround is to tell them to use another PC, which is not the best answer.

@mangod9
Copy link
Member Author

mangod9 commented Mar 29, 2024

Ok, so I am assuming you have tried to install the VS patch? That wouldn't fix it since your app might be using .NET from a different installed location.

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

@mangod9 My users doesn't have VS installed - only the runtime.
It looks like the patched DLL is clrjit.dll, so I asked them to replace the file here, but that doesn't seem to have any effect. Is there another step I am missing?

  • C:\Program Files\dotnet\shared\Microsoft.NETCore.App\8.0.3

@mangod9
Copy link
Member Author

mangod9 commented Mar 29, 2024

Some of my users are getting the exit code 0x80131506 crash

That error code is fairly generic, so have you been able to check if it's related to the popcnt invalid instruction issue?

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

@mangod9 Not sure how to fully verify, but after suddenly getting many user reports about this crash out of nowhere, and seeing the fix in Visual Studio just published, it seems to me there is a chance they are related.

@mangod9
Copy link
Member Author

mangod9 commented Mar 29, 2024

are you able to capture a crash dump of the failure? That would help with investigating further, note that this particular issue should only affect 10+ year old machines, so if its more prevalent it's something different. Thx!

@EatonZ
Copy link
Contributor

EatonZ commented Mar 29, 2024

@mangod9 Can you share instructions on how to capture a crash dump for this? I'll see what I can do.

@mangod9
Copy link
Member Author

mangod9 commented Mar 29, 2024

Here is some documentation of configuring WER to capture dumps: https://learn.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps. Configure it to collect a full dump. Hope this helps.

@EatonZ
Copy link
Contributor

EatonZ commented Mar 30, 2024

@mangod9 I have confirmed with a user that uninstalling the 8.0.3 runtime and installing 8.0.2 will fix the issue. So it would seem this is indeed related. I will recommend my users install that version until you release 8.0.4 next month.

mangod9 added a commit to mangod9/runtime that referenced this pull request Apr 11, 2024
jkotas pushed a commit that referenced this pull request Apr 12, 2024
github-actions bot pushed a commit that referenced this pull request Apr 18, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants