Accuracy problem on Apple M1 and Intel(R) UHD Graphics 770 #542
Comments
Update: OS: Ubuntu 22.04.2 LTS. Also, Cgemm results are incorrect on
Note that reverting the tuning results for the platform gives accurate results again. |
Thanks for reporting this. However, I see you wrote your own tests, but CLBlast already contains a large and sophisticated test suite. Can you run the relevant (original) CLBlast tests on your hardware for me and see if they also fail? If they don't fail, can you extend the original CLBlast tests to include the large matrices from your own tests and re-run them? |
Three tests fail on the M1 with the original, unmodified code:
|
Many
|
Thank you for running the tests. Perhaps this could be related to #533. Since I don't have the same devices to test on as you do, I simply modified But first I'll try to solve the Intel UHD 770 issue. If I simply use the Intel GPU default parameters (in |
Let me know if I can help with the M1 issue. Do we have options to build and use this library without tuning results? |
Some initial results: when I revert #341, then the issue seems resolved, at least for a few tests I did. I'll do some more investigation and re-read the original #340 issue again, and will keep you updated.
Well, it depends on what you mean by 'without tuning results', because it needs to use some set of parameters. What you can do is modify |
Thank you for the quick update!
We can add a compile definition (e.g.
# CMakeLists.txt
option(WITH_TUNING_RESULTS "" ON)
if(WITH_TUNING_RESULTS)
  # Pass the bare name; only newer CMake versions strip a leading -D here.
  add_compile_definitions(HAVE_TUNING_RESULTS)
endif()
Then we can use |
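One way the proposed definition could be consumed on the C++ side is sketched below. This is an illustration only: `KernelParams`, `DefaultParams`, `TunedParams`, and `SelectParams` are hypothetical names, the values are placeholders, and none of this reflects how CLBlast's tuning database is actually organised. It assumes the `HAVE_TUNING_RESULTS` macro from the CMake snippet above.

```cpp
#include <string>

// Hypothetical parameter set; the fields and values are placeholders.
struct KernelParams {
  int work_group_size;
  int vector_width;
};

// Conservative defaults intended to be safe on any device.
inline KernelParams DefaultParams() { return {64, 1}; }

// Stand-in for a device-specific lookup in a tuning database.
inline KernelParams TunedParams(const std::string& device_name) {
  if (device_name == "ExampleDevice") return {256, 4};
  return DefaultParams();  // no tuned entry for this device
}

// Uses tuned values only when the build was configured with tuning results.
inline KernelParams SelectParams(const std::string& device_name) {
#ifdef HAVE_TUNING_RESULTS
  return TunedParams(device_name);
#else
  (void)device_name;  // unused when tuning results are compiled out
  return DefaultParams();
#endif
}
```

With this layout, configuring with `-DWITH_TUNING_RESULTS=OFF` would make every device fall back to the defaults, which is a quick way to check whether a wrong result comes from the tuned parameters.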
This PR #543 likely solves the issue you reported on the Intel UHD 770. If you could try it out to confirm, that would be great! The issue with the Apple M1 seems unrelated, since that device doesn't use this |
I will try it later this week. Thank you for the quick fix! |
@CNugteren I can confirm that #543 fixes the accuracy problem on the Intel UHD 770. As for the Apple M1 accuracy problem, let me extend the existing tests in this repository to give you more results. |
@CNugteren I can also confirm that #543 fixes the accuracy problem on the Apple M1. I guess we are done with this issue. Thank you for the quick response, updates, and patches! |
CLBlast on Apple M1 gives incorrect Sgemm results for square matrices of size >= 1152.
macOS version: 14.4.1
Reproducer is available at https://github.com/fengyuentau/test-clblast.
Test results are shown at https://github.com/fengyuentau/test-clblast?tab=readme-ov-file#results, which are
I tried commenting out the tuning results for Apple M1, and it gives correct results. Would you accept a patch to revert the tuning results for Apple M1?
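For anyone who wants to check an Sgemm result independently of any BLAS library, one option is to compare it against a naive reference. The sketch below is neither the linked reproducer nor CLBlast's own test suite; the function names, the row-major layout, and the 1e-3 tolerance are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Naive row-major reference: C = A * B, with A (m x k), B (k x n), C (m x n).
// Accumulates in double to keep the reference's own rounding error small.
std::vector<float> ReferenceSgemm(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t m, std::size_t n, std::size_t k) {
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      double acc = 0.0;
      for (std::size_t p = 0; p < k; ++p) {
        acc += static_cast<double>(a[i * k + p]) *
               static_cast<double>(b[p * n + j]);
      }
      c[i * n + j] = static_cast<float>(acc);
    }
  }
  return c;
}

// Largest element-wise error, scaled by max(1, |expected|).
// Returns infinity on a size mismatch or if any difference is NaN.
double MaxScaledError(const std::vector<float>& actual,
                      const std::vector<float>& expected) {
  const double inf = std::numeric_limits<double>::infinity();
  if (actual.size() != expected.size()) return inf;
  double worst = 0.0;
  for (std::size_t i = 0; i < actual.size(); ++i) {
    const double want = static_cast<double>(expected[i]);
    const double diff = std::fabs(static_cast<double>(actual[i]) - want);
    if (std::isnan(diff)) return inf;
    worst = std::max(worst, diff / std::max(1.0, std::fabs(want)));
  }
  return worst;
}

// Hypothetical pass/fail criterion; the default tolerance is an assumption.
bool ResultsMatch(const std::vector<float>& actual,
                  const std::vector<float>& expected,
                  double tolerance = 1e-3) {
  return MaxScaledError(actual, expected) <= tolerance;
}
```

To use it, copy the device result back to the host and pass it as `actual`, with `ReferenceSgemm` applied to the same inputs as `expected`. The reference is O(m·n·k), so sizes around the reported 1152 threshold take a noticeable but manageable time on a CPU.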