System.Numerics.Tensor test failures on CoreCLR #97297

akoeplinger · 2024-01-22T11:32:05Z

Started happening after #97192 was merged. It looks like it's happening only in Checked coreclr configuration.

        /_/src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs(215,0): at System.Numerics.Tensors.Tests.GenericFloatingPointNumberTensorPrimitivesTests`1.SpanDestinationFunctions_SpecialValues(SpanDestinationDelegate tensorPrimitivesMethod, Func`2 expectedMethod)
           at InvokeStub_GenericFloatingPointNumberTensorPrimitivesTests`1.SpanDestinationFunctions_SpecialValues(Object, Span`1)
           at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    System.Numerics.Tensors.Tests.DoubleGenericTensorPrimitives.SpanDestinationFunctions_ValueRange(tensorPrimitivesMethod: SpanDestinationDelegate { Method = Void ReciprocalEstimate[Double](System.ReadOnlySpan`1[System.Double], System.Span`1[System.Double]), Target = null }, expectedMethod: Func`2 { Method = Double ReciprocalEstimate(Double), Target = null }) [FAIL]
      Assert.All() Failure: 33 out of 201 items in the collection did not pass.
      [27]:  Item:  Tuple (4, -19)
             Error: Assert.Equal() Failure: Values differ
                    Expected: -0.052490234375
                    Actual:   -0.052631578947368418
      [28]:  Item:  Tuple (4, -16)
             Error: Assert.Equal() Failure: Values differ
                    Expected: -0.0623779296875
                    Actual:   -0.0625

/cc @stephentoub @tannergooding

The text was updated successfully, but these errors were encountered:

ghost · 2024-01-22T11:32:09Z

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

Issue Details

Started happening after #97192 was merged. It looks like it's happening only in Checked configuration.

        /_/src/libraries/System.Numerics.Tensors/tests/TensorPrimitives.Generic.cs(215,0): at System.Numerics.Tensors.Tests.GenericFloatingPointNumberTensorPrimitivesTests`1.SpanDestinationFunctions_SpecialValues(SpanDestinationDelegate tensorPrimitivesMethod, Func`2 expectedMethod)
           at InvokeStub_GenericFloatingPointNumberTensorPrimitivesTests`1.SpanDestinationFunctions_SpecialValues(Object, Span`1)
           at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    System.Numerics.Tensors.Tests.DoubleGenericTensorPrimitives.SpanDestinationFunctions_ValueRange(tensorPrimitivesMethod: SpanDestinationDelegate { Method = Void ReciprocalEstimate[Double](System.ReadOnlySpan`1[System.Double], System.Span`1[System.Double]), Target = null }, expectedMethod: Func`2 { Method = Double ReciprocalEstimate(Double), Target = null }) [FAIL]
      Assert.All() Failure: 33 out of 201 items in the collection did not pass.
      [27]:  Item:  Tuple (4, -19)
             Error: Assert.Equal() Failure: Values differ
                    Expected: -0.052490234375
                    Actual:   -0.052631578947368418
      [28]:  Item:  Tuple (4, -16)
             Error: Assert.Equal() Failure: Values differ
                    Expected: -0.0623779296875
                    Actual:   -0.0625

/cc @stephentoub @tannergooding

Author:	akoeplinger
Assignees:	-
Labels:	`area-System.Numerics.Tensors`, `disabled-test`
Milestone:	-

dotnet#97297 dotnet#97295 dotnet#97296 Caused by dotnet#97192

akoeplinger · 2024-01-22T12:24:48Z

Hmm locally I see it in coreclr Release too (osx-arm64)

#97297 #97295 #97296 Caused by #97192

stephentoub · 2024-01-22T13:27:27Z

@tannergooding, does the lack of precision make sense to you here? The test should be easily fixed by changing the equality check to have a higher threshold, but I want to make sure it's desirable here before I do. The failures all appear to be with ReciprocalEstimate and ReciprocalSqrtEstimate, for float and double.

pavelsavara · 2024-01-22T13:48:13Z

Perhaps there is the same or similar problem on mono/wasm ?
Here we have linearly slow execution and then timeout. This one is multi-threaded build.
Or I have not notice that problem before.

Log
Build

[12:59:07] info: [2024-01-22T12:59:07.585Z] [PASS] System.Numerics.Tensors.Tests.SingleGenericTensorPrimitives.SpanDestinationFunctions_SpecialValues(tensorPrimitivesMethod: SpanDestinationDelegate { Method = Void Log[Single](System.ReadOnlySpan`1[System.Single], System.Span`1[System.Single]), Target = null }, expectedMethod: Func`2 { Method = Single Log(Single), Target = null })
[12:59:11] fail: Tests timed out. Killing driver service pid 80
[12:59:11] fail: Application has finished with exit code TIMED_OUT but 0 was expected
XHarness exit code: 71 (GENERAL_FAILURE)

stephentoub · 2024-01-22T13:49:41Z

@akoeplinger opened #97295 and #97296 for mono.

akoeplinger · 2024-01-22T14:13:00Z

#97301 disabled the tests on Mono JIT but still runs them on interpreter since I saw them passing there locally and on CI on desktop, it's possible that we still see an issue on wasm+interpreter.

tannergooding · 2024-01-22T15:59:14Z

does the lack of precision make sense to you here? The test should be easily fixed by changing the equality check to have a higher threshold, but I want to make sure it's desirable here before I do. The failures all appear to be with ReciprocalEstimate and ReciprocalSqrtEstimate, for float and double.

These look to be failing for double because it isn't handled as part of ReciprocalEstimateOperator and so the vectorized code is just doing 1 / x which is going to be much more accurate than the explicit Estimate APIs for some platforms.

For x64, float guarantees a relativeError <= 1.5 * 2^-12, which is no more than 0.0003662109375, while the allowed tolerance being used is actually 0.0001. double doesn't have an accelerated instruction for older hardware and so just uses 1 / x. AVX512 hardware does support both float and double, with a gurantee that relativeError <= 2^-14 (0.00006103515625); however, we are not currently utilizing this instruction.

For Arm64, both float and double are supported but they don't document the maximum relative error, just the algorithm (and I haven't done the math to determine what the actual maximum relative error would be). The ReciprocalSqrt does one iteration of Newton-Raphson which gives approx relativeError <= 1.5 * 2^-7

For the System.Math/MathF tests we use the following for values:

8.8817841970012523e-16 - maximum relative error for double (roughly 2^-50)
4.76837158e-07 - maximum relative error for float (roughly 2^-21)
1.171875e-02 - maximum relative error for estimates

These values are then scaled up or down based on floor(log10(input)) + 1. So 0.X uses the maximum relative error as given, X.X multiplies it by 10, XX.X multiplies it by 100, 0.0X divides it by 10, and so forth.

It's not the perfect system and there are better/more accurate ways to adjust it based on the power of log2(input) instead, but this is a fast/simple method that accounts for it while keeping things generally as expected.

stephentoub · 2024-01-22T16:01:17Z

Thanks, @tannergooding. So, short answer is we should just update the test for now with a higher threshold, right? (And update the reciprocal(sqrt) operators to use the relevant API on Arm)

tannergooding · 2024-01-22T16:09:55Z

Yeah, I'd use 1.171875e-02 for now since we're currently just sharing all the logic between the tests.

More ideally we'd allow each function to specify its own tolerance so we can be more strict with APIs like Sin/Cos and less strict with APIs like SqrtEstimate

dotnet#97297 dotnet#97295 dotnet#97296 Caused by dotnet#97192

akoeplinger added area-System.Numerics.Tensors disabled-test The test is disabled in source code against the issue labels Jan 22, 2024

ghost added the untriaged New issue has not been triaged by the area owner label Jan 22, 2024

akoeplinger added this to the 9.0.0 milestone Jan 22, 2024

akoeplinger removed the untriaged New issue has not been triaged by the area owner label Jan 22, 2024

akoeplinger mentioned this issue Jan 22, 2024

Disable failing System.Numerics.Tensors tests #97301

Merged

akoeplinger added a commit to akoeplinger/runtime that referenced this issue Jan 22, 2024

Disable failing System.Numerics.Tensors tests

ec22f2d

dotnet#97297 dotnet#97295 dotnet#97296 Caused by dotnet#97192

stephentoub pushed a commit that referenced this issue Jan 22, 2024

Disable failing System.Numerics.Tensors tests (#97301)

ebd5cb6

#97297 #97295 #97296 Caused by #97192

stephentoub self-assigned this Jan 22, 2024

stephentoub mentioned this issue Jan 22, 2024

Specify higher equality tolerance for ReciprocalEstimate tests #97332

Merged

ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 22, 2024

This was referenced Jan 23, 2024

JIT: Generalize FlowGraphNaturalLoop::AnalyzeIteration #97327

Merged

JIT: Skip BBJ_CALLFINALLYRET in BasicBlock::VisitEHSuccs #97184

Merged

tmds pushed a commit to tmds/runtime that referenced this issue Jan 23, 2024

Disable failing System.Numerics.Tensors tests (dotnet#97301)

e579c91

dotnet#97297 dotnet#97295 dotnet#97296 Caused by dotnet#97192

stephentoub closed this as completed in #97332 Jan 24, 2024

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 24, 2024

github-actions bot locked and limited conversation to collaborators Feb 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

System.Numerics.Tensor test failures on CoreCLR #97297

System.Numerics.Tensor test failures on CoreCLR #97297

akoeplinger commented Jan 22, 2024 •

edited

Loading

ghost commented Jan 22, 2024

akoeplinger commented Jan 22, 2024

stephentoub commented Jan 22, 2024

pavelsavara commented Jan 22, 2024

stephentoub commented Jan 22, 2024

akoeplinger commented Jan 22, 2024

tannergooding commented Jan 22, 2024

stephentoub commented Jan 22, 2024 •

edited

Loading

tannergooding commented Jan 22, 2024

System.Numerics.Tensor test failures on CoreCLR #97297

System.Numerics.Tensor test failures on CoreCLR #97297

Comments

akoeplinger commented Jan 22, 2024 • edited Loading

ghost commented Jan 22, 2024

akoeplinger commented Jan 22, 2024

stephentoub commented Jan 22, 2024

pavelsavara commented Jan 22, 2024

stephentoub commented Jan 22, 2024

akoeplinger commented Jan 22, 2024

tannergooding commented Jan 22, 2024

stephentoub commented Jan 22, 2024 • edited Loading

tannergooding commented Jan 22, 2024

akoeplinger commented Jan 22, 2024 •

edited

Loading

stephentoub commented Jan 22, 2024 •

edited

Loading