Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run on .NET 8 Arm64 seems to remove some parts of a conditional statement, probably only after Tier 1 JIT compilation #99954

Closed
Louis-Antoine-Blais-Morin opened this issue Mar 19, 2024 · 12 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Milestone

Comments

@Louis-Antoine-Blais-Morin
Copy link

Louis-Antoine-Blais-Morin commented Mar 19, 2024

Description

We have a code running that runs correctly on Windows x64 and that was running correctly on Arm64 with .NET 7.

Starting from .NET 8, we observed that one if statement does not seem to behave as written in the code.

Extract of the code is like this:

public void Select(Entity entity)
{
   if (ShouldSelectEntity(entity)
   {
      SelectEntity(entity);
   }
}

private void ShouldSelectEntity(Entity entity)
{
   if (entity.TestObject != null ||
      entity.Score <= m_scoreThreshold) // THIS DOES NOT SEEM TO WORK!!!
      return false;

   ... // Other complex conditions, returning true or false according to entity content.
}

private void SelectEntity(Entity entity)
{
   if (entity.Score <= m_scoreThreshold)
   {
      // We should never get here because of the test in 
      // ShouldSelectEntity.
      throw new Exception($"Should not have selected this object because Score {entity.Score} is lower than threshold {m_scoreThreshold}");
   }

   ... // Other stuff.
}

We observed that the exception in SelectEntity is thrown, even if it should never happens.

The code above is not sufficient to reproduce the bug. For the bug to be present, the above code, it needs to be part of a larger program. The initial program has been stripped greatly and is published here:
https://github.com/Louis-Antoine-Blais-Morin/JitBugArm64
It is still relatively large, but it seems that removing parts of the program seems to make the bug disappear. Maybe more effort can lead to a smaller code.

Reproduction Steps

  • Checkout code from
    https://github.com/Louis-Antoine-Blais-Morin/JitBugArm64
  • Open JitBugArm64.sln
  • Compile it in Release in VisualStudio 17.9.1.
  • Copy the content of bin\Release\net8.0 on a Arm64 platform running Linux and where .NET Runtime 8.0.3 is installed
  • dotnet run JitBugArm64.dll
  • An exception is thrown after few seconds

Expected behavior

The if statement should behave as expected in the C# code.
Therefore, no exception should be thrown in the code linked to this bug.

Actual behavior

The if statement in the ShouldSelectEntity does not behave correctly.
The function seems to returns true even if entity.Score > m_scoreThreshold.

This seems to happen after a few seconds of correct operation.
I suspect that the behavior starts to be incorrect after Tier 1 JIT compilation.

Regression?

This is a regression since the program is working correctly with .NET 7.
The program is also working correctly under Windows x64.

Known Workarounds

The following workarounds seem to fix the behavior:

  • Replace the if ( ... || ...) return false; by two individual if statements.
  • Add the following attribute on the ShouldSelectEntity method: [MethodImpl(MethodImplOptions.NoInlining)]
  • Remove the <DebugSymbols>true</DebugSymbols> in the .csproj

Configuration

$ dotnet --list-runtimes
Microsoft.AspNetCore.App 8.0.3 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 8.0.3 [/usr/share/dotnet/shared/Microsoft.NETCore.App]`

Processor Architecture is Arm64 Cortex A53 / Armv8-A.

Linux is "Yocto Kirkstone".

Other information

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Mar 19, 2024
@jakobbotsch
Copy link
Member

cc @dotnet/jit-contrib

@JulieLeeMSFT
Copy link
Member

@kunalspathak PTAL.

@AndyAyersMS
Copy link
Member

@kunalspathak can you give an update on this? Or if you're tied up can we find somebody else to dig in?

@kunalspathak
Copy link
Member

@kunalspathak can you give an update on this? Or if you're tied up can we find somebody else to dig in?

Sorry, I forgot about this. I was able to repro this and I can confirm that this doesn't repro on .NET 9 main. I will investigate.

@kunalspathak
Copy link
Member

kunalspathak commented Mar 27, 2024

The problem is the reverse condition was wrongly calculated as seen in screenshot( left= net9/correct, right= net8/wrong)

image

We fixed it in #92316 and might need to backport.

@kunalspathak
Copy link
Member

btw, this is easily reproducible with DOTNET_TieredCompilation=0 and marking Select as NoInlining and ShouldSelectEntity as AggresivelyInline. I also changed ScoreThreshold=10.0f and Score = (float)0.0f,//random.NextDouble(),

        [MethodImpl(MethodImplOptions.NoInlining)]
        private void Select(Entity trackingObject)
        {
            if (ShouldSelectEntity(trackingObject))
            {
                SelectEntity(trackingObject);
            }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private bool ShouldSelectEntity(Entity entity)
        {
            ...
        }

@kunalspathak
Copy link
Member

We fixed it in #92316 and might need to backport.

I verified this fix locally and it doesn't throw Exception anymore.

@JulieLeeMSFT JulieLeeMSFT added this to the 9.0.0 milestone Mar 29, 2024
@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Mar 29, 2024
@kunalspathak
Copy link
Member

This should be part of the upcoming .NET 8.0.5

@AndyAyersMS
Copy link
Member

@Louis-Antoine-Blais-Morin .NET 8.0.5 is now available -- when you get a chance can you verify it fixes this issue for you?

@Louis-Antoine-Blais-Morin
Copy link
Author

@AndyAyersMS
Thanks for all this work. We followed it closely.
With the fix now available, we are now trying once again to switch to .NET 8 (because of the problem, we stayed in .NET 7).
I will inform you about the results in the coming days.

@kunalspathak
Copy link
Member

@Louis-Antoine-Blais-Morin - did you get a chance to validate this?

@JulieLeeMSFT JulieLeeMSFT assigned TIHan and unassigned kunalspathak Jul 25, 2024
@JulieLeeMSFT JulieLeeMSFT added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jul 25, 2024
@Louis-Antoine-Blais-Morin
Copy link
Author

Good morning,

Providing feedback on this got of my radar for a while.

I had time to test the correction today:

  • We had updated our Linux image to .NET runtime 8.0.5
  • I removed the decorator [MethodImpl(MethodImplOptions.NoInlining)] that was in our code as a workaround
  • Code runs correctly

So, as far as I can see, the bug is fixed.
Thanks a lot!

@dotnet-policy-service dotnet-policy-service bot added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Jul 29, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Sep 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Projects
None yet
Development

No branches or pull requests

6 participants