LAPACK testing failures on Sapphire Rapids (55 other COMPLEX) #4282
Comments
The INFO returned here appears to be the number of eigenvectors which failed to converge. As far as the BLAS kernels are concerned, SapphireRapids is currently CooperLake which is SkylakeX with added BFLOAT16 functions. So the most likely source of numerical differences will be in target-specific code generation based on the
Turns out the actual cause of the difference was the failure to enable the SkylakeX CASUM "microkernel" for CooperLake and SapphireRapids when support for either was added. (This also implies a loss of precision in the plain C fallback function of the CASUM kernel that I have not looked into yet, but no target configuration is supposed to use it.)
This kernel is only used on Skylake+ if the kernel with AVX512 intrinsics can't be used, but it used the variable x1 incorrectly in the tail end of the loop: x1 is still at its initial value instead of where x points to. This caused 55 "other error"s in the LAPACK tests (OpenMathLib#4282). This change makes casum.c as similar as possible to zasum.c, because zasum.c does this correctly.
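The stale-pointer pattern described above can be sketched in a few lines of C. This is only an illustration and not the OpenBLAS casum.c source: the function names `casum_sketch` and `casum_stale_tail`, the helper `abs_f`, and the block size of 4 complex values are assumptions made for the example. It shows how a tail loop that reads through a pointer captured before the main loop re-reads the start of the vector instead of the unprocessed remainder.

```c
#include <stddef.h>

/* Illustrative only: not the OpenBLAS casum.c source. Both functions sum
   |re| + |im| over n complex floats stored interleaved (re, im, re, im, ...).
   All names and the block size are hypothetical. */

static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Correct pattern: the tail loop continues from the advanced pointer x. */
static float casum_sketch(size_t n, const float *x)
{
    size_t blocks = n / 4;   /* main loop handles 4 complex values per pass */
    size_t tail = n % 4;     /* leftover complex values */
    float sum = 0.0f;

    for (size_t i = 0; i < blocks; i++) {
        for (size_t k = 0; k < 8; k++)
            sum += abs_f(x[k]);
        x += 8;              /* advance past the block just summed */
    }
    for (size_t k = 0; k < 2 * tail; k++)
        sum += abs_f(x[k]);  /* x points at the first unprocessed value */
    return sum;
}

/* Faulty pattern: x1 is captured before the main loop and never advanced,
   so the tail loop re-reads the beginning of the vector. */
static float casum_stale_tail(size_t n, const float *x)
{
    const float *x1 = x;     /* stays at the initial value */
    size_t blocks = n / 4;
    size_t tail = n % 4;
    float sum = 0.0f;

    for (size_t i = 0; i < blocks; i++) {
        for (size_t k = 0; k < 8; k++)
            sum += abs_f(x[k]);
        x += 8;
    }
    for (size_t k = 0; k < 2 * tail; k++)
        sum += abs_f(x1[k]); /* wrong: should read through x */
    return sum;
}
```

With a vector whose length is a multiple of the block size the two functions agree, which is one reason a bug of this shape can go unnoticed until a test exercises the tail.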
In 0.3.24 (the first to actually support the Sapphire Rapids after #4002) I see consistently 55 "other" failures in the COMPLEX part of the LAPACK tests:
This happens for GCC 11.2, 11.3, 12.2, 12.3 and also (with backports of #4002) since at least 0.3.20.
The same happens for TARGET=SAPPHIRERAPIDS without the patch, so this is likely also an issue in 0.3.19. A workaround seems to be TARGET=SKYLAKEX.
From the output I only found: