Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avx2 code causes AccessViolationExceptions with AVX512 available #92357

Closed
TechPizzaDev opened this issue Sep 20, 2023 · 9 comments
Closed

Avx2 code causes AccessViolationExceptions with AVX512 available #92357

TechPizzaDev opened this issue Sep 20, 2023 · 9 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@TechPizzaDev
Copy link
Contributor

Description

Avx2 code seems to throw unexpected AccessViolations when AVX512 is available.

Reproduction Steps

  1. Clone repo https://github.com/TechPizzaDev/SoftwareRasterizer
  2. Checkout branch net8-upgrade
  3. Download Castle folder from https://github.com/rawrunprotected/rasterizer/tree/master/SoftwareRasterizer and put next to built SoftwareRasterizer.exe
  4. Run project in Release mode to crash (works in Debug mode)

Expected behavior

No crash.

Actual behavior

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at SoftwareRasterizer.Avx2Rasterizer`1[[SoftwareRasterizer.FmaX86, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]].rasterizeLoop(UInt32, UInt16*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Int32>*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Single>*, UInt32*, UInt32*, UInt32*, UInt32*, Boolean)
   at SoftwareRasterizer.Avx2Rasterizer`1[[SoftwareRasterizer.FmaX86, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]].rasterize[[SoftwareRasterizer.Rasterizer+NearClipped, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](SoftwareRasterizer.Occluder ByRef)
   at SoftwareRasterizer.Main.WndProc(TerraFX.Interop.Windows.HWND, UInt32, TerraFX.Interop.Windows.WPARAM, TerraFX.Interop.Windows.LPARAM)
   at TerraFX.Interop.Windows.Windows.UpdateWindow(TerraFX.Interop.Windows.HWND)
   at TerraFX.Interop.Windows.Windows.UpdateWindow(TerraFX.Interop.Windows.HWND)
   at SoftwareRasterizer.Main.wWinMain(TerraFX.Interop.Windows.HINSTANCE)
   at SoftwareRasterizer.Program.Main(System.String[])

Regression?

Regression in net8, worked on net7 and net6.

Known Workarounds

Setting DOTNET_EnableAVX512F=0 makes it work as expected in Release mode. The project also works in Debug mode.

Configuration

.NET: 8.0.0-rc.1.23419.4
OS: Microsoft Windows 11 Home (10.0.22621 version 22621)
CPU: 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz
Arch: x64

Other information

There seems to be a single codegen difference:

    C4E2793DC1           vpmaxsd  xmm0, xmm0, xmm1
-   C4E2793905EC070000   vpminsd  xmm0, xmm0, xmmword ptr [reloc @RWD80]      # DOTNET_EnableAVX512F=0
+   62F27D183905EB070000 vpminsd  xmm0, xmm0, dword ptr [reloc @RWD80] {1to4} # DOTNET_EnableAVX512F=1
    C5F8298590FCFFFF     vmovaps  xmmword ptr [rbp-0x370], xmm0

DOTNET_EnableAVX512F=0: working disasm.txt
DOTNET_EnableAVX512F=1: broken disasm.txt

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 20, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 20, 2023
@ghost
Copy link

ghost commented Sep 20, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Avx2 code seems to throw unexpected AccessViolations when AVX512 is available.

Reproduction Steps

  1. Clone repo https://github.com/TechPizzaDev/SoftwareRasterizer
  2. Checkout branch net8-upgrade
  3. Download Castle folder from https://github.com/rawrunprotected/rasterizer/tree/master/SoftwareRasterizer and put next to built SoftwareRasterizer.exe
  4. Run project in Release mode to crash (works in Debug mode)

Expected behavior

No crash.

Actual behavior

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at SoftwareRasterizer.Avx2Rasterizer`1[[SoftwareRasterizer.FmaX86, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]].rasterizeLoop(UInt32, UInt16*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Int32>*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Single>*, System.Runtime.Intrinsics.Vector128`1<Single>*, UInt32*, UInt32*, UInt32*, UInt32*, Boolean)
   at SoftwareRasterizer.Avx2Rasterizer`1[[SoftwareRasterizer.FmaX86, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]].rasterize[[SoftwareRasterizer.Rasterizer+NearClipped, SoftwareRasterizer, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](SoftwareRasterizer.Occluder ByRef)
   at SoftwareRasterizer.Main.WndProc(TerraFX.Interop.Windows.HWND, UInt32, TerraFX.Interop.Windows.WPARAM, TerraFX.Interop.Windows.LPARAM)
   at TerraFX.Interop.Windows.Windows.UpdateWindow(TerraFX.Interop.Windows.HWND)
   at TerraFX.Interop.Windows.Windows.UpdateWindow(TerraFX.Interop.Windows.HWND)
   at SoftwareRasterizer.Main.wWinMain(TerraFX.Interop.Windows.HINSTANCE)
   at SoftwareRasterizer.Program.Main(System.String[])

Regression?

Regression in net8, worked on net7 and net6.

Known Workarounds

Setting DOTNET_EnableAVX512F=0 makes it work as expected in Release mode. The project also works in Debug mode.

Configuration

.NET: 8.0.0-rc.1.23419.4
OS: Microsoft Windows 11 Home (10.0.22621 version 22621)
CPU: 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz
Arch: x64

Other information

There seems to be a single codegen difference:

    C4E2793DC1           vpmaxsd  xmm0, xmm0, xmm1
-   C4E2793905EC070000   vpminsd  xmm0, xmm0, xmmword ptr [reloc @RWD80]      # DOTNET_EnableAVX512F=0
+   62F27D183905EB070000 vpminsd  xmm0, xmm0, dword ptr [reloc @RWD80] {1to4} # DOTNET_EnableAVX512F=1
    C5F8298590FCFFFF     vmovaps  xmmword ptr [rbp-0x370], xmm0

DOTNET_EnableAVX512F=0: working disasm.txt
DOTNET_EnableAVX512F=1: broken disasm.txt

Author: TechPizzaDev
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@tannergooding
Copy link
Member

Just at a glance, the raw decoding is:

- vpminsd xmm0, xmm0, XMMWORD PTR [rip+0x7ec]
+ vpminsd xmm0, xmm0, DWORD PTR [rip+0x7eb] {1to4}

and so that looks to be fine/generally correct.

It's possible that the address being encoded is somehow incorrect and that's probably the thing that needs to be checked first

@TechPizzaDev
Copy link
Contributor Author

Here is codegen for DOTNET_TieredCompilation=0:

DOTNET_EnableAVX512F=0: working disasm.txt

DOTNET_EnableAVX512F=1: broken disasm.txt

@tannergooding
Copy link
Member

tannergooding commented Sep 20, 2023

I'm not able to reproduce on my Zen4 machine.

Edit: Nevermind, the checked in project had DOTNET_EnableAVX512F=0 already set and I missed that

@tannergooding
Copy link
Member

Looks like, firstBlocks is corrupted (for some reason) and that causes the AV's because pPrimitiveOut can arbitrarily be an incorrect address.

It is non deterministic across runs when AVX512 is enabled.

Still investigating to determine where that is being corrupted

@tannergooding
Copy link
Member

Looks like the broadcast is getting dropped from and its trying to load the index directly:

Vector256<int> firstBlockIdx = Avx2.Add(Avx2.MultiplyLow(minY.AsInt16(), Vector256.Create((int)m_blocksX).AsInt16()).AsInt32(), minX);
Avx.StoreAligned((int*)firstBlocks, firstBlockIdx);

@tannergooding
Copy link
Member

Simple repro:

Avx2.MultiplyLow(x + x, Vector256.Create(y).AsInt16())

the Create(int).AsInt16() breaks it, likely because the broadcast is over a 32-bit value
so I need to find where it's failing to validate the containing node's operand size

@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Sep 20, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Sep 20, 2023
@JulieLeeMSFT
Copy link
Member

Hi @tannergooding, did you have a chance to look into this?

@tannergooding
Copy link
Member

This was already resolved with #92414

it just didn’t get linked/auto-closed on merge

@ghost ghost locked as resolved and limited conversation to collaborators Oct 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

3 participants