[x86] SequenceEqual returns false for equal spans #65292
Comments
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details
Run: runtime-libraries-coreclr outerloop 20220213.4
Failed test:
Error message:
|
To my surprise I am able to reproduce it locally:
.\build.cmd -c Release -subset clr+libs+libs.tests -arch x86
.\dotnet.cmd build .\src\libraries\System.IO.FileSystem\tests\System.IO.FileSystem.Tests.csproj /t:Test /p:Configuration=Release /p:TargetArchitecture=x86 /p:Outerloop=true |
Tagging subscribers to this area: @dotnet/area-system-memory
Issue Details
Run: runtime-libraries-coreclr outerloop 20220213.4
Failed test:
Error message:
|
It seems to be a bug in runtime/src/libraries/Common/tests/TestUtilities/System/AssertExtensions.cs Lines 445 to 471 in e23382c
I was able to reproduce it using the following test:
public class Repro
{
    [Fact]
    public void TryToMimic()
    {
        while (true)
        {
            byte[] writerBytes = RandomNumberGenerator.GetBytes(10 * 1024 * 1024);
            var readerBytes = new byte[writerBytes.Length];
            writerBytes.CopyTo(readerBytes, 0);
            Assert.True(writerBytes.SequenceEqual(readerBytes));
        }
    }
}
@EgorBo is there any chance you could take a look? |
Thanks for the minimal repro, looking at it now |
I can't reproduce it locally 🙁 the
and the repro specifically finish just fine for me (win10-x64, core i7 8700K) |
I can't always reproduce it when running all |
ah, let me try to run it a bit longer then 👍 |
We have a number of other tests failing intermittently on Windows x86 with very similar symptoms:
It looks like a Windows x86 specific GC hole to me. |
@jkotas could you please explain what makes you think it's a GC hole? I am just curious what symptoms GC holes have in general. I was expecting it to be a misaligned read, since it does not always reproduce and there is no hang/AV. |
https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/clr-code-guide.md#211-how-gc-holes-are-created explains what a GC hole is. The typical manifestations of GC holes are intermittent crashes and data corruption. |
@jkotas thanks for the pointer!
Is my understanding correct that switching from managed to native memory should "solve" the GC hole issue for my small repro? Because I am still able to repro it using the following code:
const int size = 10 * 1024 * 1024;
while (true)
{
    void* writerBytesPointer = NativeMemory.Alloc(size);
    void* readerBytesPointer = NativeMemory.Alloc(size);
    try
    {
        Span<byte> writerBytes = new Span<byte>(writerBytesPointer, size);
        Span<byte> readerBytes = new Span<byte>(readerBytesPointer, size);
        RandomNumberGenerator.Fill(writerBytes);
        writerBytes.CopyTo(readerBytes);
        Assert.True(writerBytes.SequenceEqual(readerBytes));
    }
    finally
    {
        NativeMemory.Free(readerBytesPointer);
        NativeMemory.Free(writerBytesPointer);
    }
} |
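For anyone else trying to narrow this down, a rough sketch of a diagnostic variant of the repros above: when SequenceEqual reports false, it re-checks the same buffers with a plain scalar loop, which would help tell real data corruption apart from a wrong comparison result. The class and method names (SequenceEqualDiagnostics, FirstMismatch, Run) are made up for illustration and are not part of the runtime or of the original test.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical diagnostic helper; not part of the runtime or of the original repro.
public static class SequenceEqualDiagnostics
{
    // Scalar byte-by-byte scan: returns the index of the first differing byte,
    // or -1 when both spans have the same length and the same content.
    public static int FirstMismatch(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        int common = Math.Min(left.Length, right.Length);
        for (int i = 0; i < common; i++)
        {
            if (left[i] != right[i])
            {
                return i;
            }
        }
        return left.Length == right.Length ? -1 : common;
    }

    // Repeats the copy-and-compare step a bounded number of times
    // and describes the first failure, if any.
    public static string Run(int size, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            // Fresh buffers on every iteration, as in the repro above.
            byte[] writerBytes = RandomNumberGenerator.GetBytes(size);
            byte[] readerBytes = new byte[size];
            writerBytes.CopyTo(readerBytes, 0);

            ReadOnlySpan<byte> writer = writerBytes;
            ReadOnlySpan<byte> reader = readerBytes;

            if (!writer.SequenceEqual(reader))
            {
                int index = FirstMismatch(writer, reader);
                return index < 0
                    ? $"iteration {i}: SequenceEqual returned false, but the scalar scan found no difference"
                    : $"iteration {i}: buffers really differ at index {index}";
            }
        }
        return "no failure observed";
    }
}
```

Calling something like SequenceEqualDiagnostics.Run(10 * 1024 * 1024, 10_000) from a test and logging the returned string should show which of the two cases is being hit. The iteration count is arbitrary.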
GC holes can be anywhere, even in simple C#-only code where the JIT doesn't report gcinfo correctly, so when the GC interrupts, it doesn't update all registers correctly. Are you on the most recent main, btw? |
I am on:
|
If I run just this unit test, I am not able to repro. Should I get a memory dump? |
wouldn't hurt I'd guess. I've been running your repro in the FileSystem.IO test suite for 7 minutes already and still no signs of the bug |
@jkotas it does:
|
@VSadov I'm pretty sure the only path where AVX registers need careful handling is the restore path used in suspension. Fortunately, the AVX registers are not considered saved state for the purposes of function calls, so EH logic has to tolerate them being trashed anyways. |
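To picture why the AVX register state matters for this bug, here is a small simulation, assuming the failing comparison is done with 256-bit vectors (an assumption for illustration; nothing in this thread shows the exact instructions involved). It uses the System.Runtime.Intrinsics APIs to zero the upper 128 bits of a vector value and shows that a full-width comparison then fails even though the lower half still matches. UpperHalfDemo and its members are hypothetical names, and the code only imitates the effect of a lost upper half; it does not touch real registers or thread suspension.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical illustration; this imitates a lost upper register half in software.
public static class UpperHalfDemo
{
    // Replaces the upper 128 bits of a 256-bit value with zeros.
    public static Vector256<byte> ClobberUpperHalf(Vector256<byte> value)
        => value.WithUpper(Vector128<byte>.Zero);

    // Returns (equal before damage, equal after damage, lower halves still equal).
    public static (bool Before, bool After, bool LowerHalves) Compare()
    {
        Vector256<byte> left = Vector256.Create((byte)0x5A);
        Vector256<byte> right = Vector256.Create((byte)0x5A);

        bool before = left.Equals(right);

        Vector256<byte> damaged = ClobberUpperHalf(left);
        bool after = damaged.Equals(right);
        bool lowerHalves = damaged.GetLower().Equals(right.GetLower());

        return (before, after, lowerHalves);
    }
}
```

UpperHalfDemo.Compare() returns (true, false, true): identical data compares as different once the upper half is gone, which matches the symptom of SequenceEqual returning false for equal spans without any crash.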
@davidwrighton Right. extended registers are volatile and thus in a case of software exception that goes via In the fix (#65292) I am no longer using
Ideally, we will switch x86 to the same plan as x64, but it is a bit longer story and I think we need a quick fix for now. |
Should this be reopened to track the need for servicing backport? |
Should this also be ported to Preview2 (if there is time?) |
Preview 2 is basically complete. We will snap for Preview 3 in the coming weeks. In the meantime we can provide privates if a workaround is needed. |
@MaximLipnin @VSadov I assume this is not blocking anything any more as it is fixed in main and this issue is to track backports? |
Once we have a full fix, we will likely be taking the fix down level to all supported versions. |
The last tracked piece, porting to 3.1, has been completed. Closing. |