cpu: aarch64: Expand brgemm aarch64 unsupported cases handling mechanism #2099
Remove the asserts that cause segfaults in the test and handle unsupported cases through oneDNN status types instead. Also remove the leftover asserts in brgemm_matmul.cpp and modify acl_deconvolution.hpp so that the op/f32/deconv_bwd_d test passes.
const dnn_mem_t &mem_dt = args.find(arg);
const dnn_mem_t &mem_fp = args_.find(arg);
Why is this change needed? What's the issue with auto?
@dzarukin Hi Dmitry, thanks for having a look at this. I replaced auto with explicit data types for improved readability, consistency, and type safety. When debugging, it was particularly useful to have the data type spelled out, and I went the extra mile to check that the function returns the same type. It should not be a problem, but I am happy to revisit it if it is a deal breaker.
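For readers following the auto discussion, here is a minimal sketch of what is at stake. It assumes a find() that returns a const reference, as the explicit declarations in the diff suggest. The types mem_t and args_t are hypothetical stand-ins, not benchdnn's real dnn_mem_t and args_t.

```cpp
#include <cassert>
#include <map>
#include <type_traits>

// Hypothetical stand-in for a memory object such as benchdnn's dnn_mem_t.
struct mem_t {
    int id = 0;
};

// Hypothetical stand-in for an argument container with a find() accessor.
class args_t {
public:
    void set(int arg, int id) { mems_[arg].id = id; }

    // Returns a reference to the stored object, or to an empty one if absent.
    const mem_t &find(int arg) const {
        auto it = mems_.find(arg);
        return it == mems_.end() ? empty_ : it->second;
    }

private:
    std::map<int, mem_t> mems_;
    mem_t empty_;
};

// `const auto &` and the explicit type bind to the very same object.
bool explicit_and_deduced_alias(const args_t &args, int arg) {
    const mem_t &explicit_ref = args.find(arg);
    const auto &deduced_ref = args.find(arg);
    static_assert(
            std::is_same<decltype(deduced_ref), const mem_t &>::value,
            "const auto & deduces the same reference type");
    return &explicit_ref == &deduced_ref;
}

// Plain `auto` drops the reference and silently makes a copy.
bool plain_auto_copies(const args_t &args, int arg) {
    auto copy = args.find(arg);
    static_assert(std::is_same<decltype(copy), mem_t>::value,
            "plain auto deduces a value type");
    return &copy != &args.find(arg);
}
```

Under these assumptions, `const auto &` and the explicit `const mem_t &` behave identically, so the change is stylistic. The explicit form mainly guards against someone later writing plain `auto` and copying the object by accident.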
I’ve been working on this for the past few days, and all tests for the Graph API should run fine. A few edge cases still fail; I’ll detail them in an issue soon, and they should be fixed in the foreseeable future.
Thanks for the approval!
Could you split the change to acl_deconvolution in a separate commit as it seems to fix a different issue (unrelated to brgemm)?
@mgouicem Hi Mourad, thanks for having a look at this. I understand your opinion, but the reason I added it to this PR is because
We are not approvers for the graph component. @TaoLv could you help?
The change that triggers review from the Graph component is discussed here. Considering that it's purely stylistic, it may make sense to just drop it.
Just to clarify on this topic: this is needed because the test segfaults. These fixes are high priority because segfaults are a security risk. The patch that introduced this did not come from us, and we will take further action once this change has been submitted. We already have an issue prepared. We are on a clear path to minimise our tolerance for tests that do not properly guard unsupported cases and for PRs that have slipped through our fingers in the past.
@dzarukin, @theComputeKid, @mgouicem, @Radu2k after this PR is merged, matmul calls go to gemm:jit:f32 instead of brg:sve_256 on Graviton 3. This would slow down matmuls to a great extent on aarch64 when oneDNN is not built with ACL.
Thanks for letting us know. @Radu2k can you please take a look?
@Shreyas-fuj: I can see that the issue is that the conditions of the assert were flipped:
These two are not equivalent, but flipped. I'll fix. Thanks once again for picking this up.
@Shreyas-fuj Thank you for bringing the issue to our attention. This PR is the fix for the observed behaviour.
@theComputeKid, @Ryo-not-rio, thanks for the fix!
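The regression discussed in this thread can be illustrated with a minimal sketch. It assumes a dispatcher that tries implementations in order and picks the first whose init returns success. Everything here is hypothetical (status_t, problem_t, the vector-length check, and the init functions); only the implementation names brg:sve_256 and gemm:jit:f32 come from the thread. The actual condition that was flipped in the PR is not shown in this page and is not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical status type, loosely modelled on a success/unimplemented pair.
enum class status_t { success, unimplemented };

// Hypothetical problem descriptor; the real brgemm checks are different.
struct problem_t {
    int vector_bytes; // e.g. 32 for a 256-bit SVE machine
};

// The original style was `assert(supported)`. A faithful conversion bails
// out when the condition does NOT hold.
status_t brgemm_init_correct(const problem_t &p) {
    const bool supported = p.vector_bytes == 32;
    if (!supported) return status_t::unimplemented;
    return status_t::success;
}

// A flipped conversion rejects exactly the supported case.
status_t brgemm_init_flipped(const problem_t &p) {
    const bool supported = p.vector_bytes == 32;
    if (supported) return status_t::unimplemented;
    return status_t::success;
}

// Generic fallback that accepts every problem.
status_t gemm_init(const problem_t &) {
    return status_t::success;
}

struct impl_t {
    const char *name;
    status_t (*init)(const problem_t &);
};

// Returns the name of the first implementation that accepts the problem.
std::string dispatch(const std::vector<impl_t> &impls, const problem_t &p) {
    for (const auto &impl : impls)
        if (impl.init(p) == status_t::success) return impl.name;
    return "none";
}

std::string pick_with_correct_check(const problem_t &p) {
    return dispatch({{"brg:sve_256", brgemm_init_correct},
                            {"gemm:jit:f32", gemm_init}},
            p);
}

std::string pick_with_flipped_check(const problem_t &p) {
    return dispatch({{"brg:sve_256", brgemm_init_flipped},
                            {"gemm:jit:f32", gemm_init}},
            p);
}
```

With the correct check, a 256-bit problem is served by the first entry. With the flipped check, the same problem silently falls through to the generic fallback. No test fails, since the result is still correct, which is why this kind of regression tends to surface only as a slowdown.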
Description
This replaces the asserts in brgemm that make test_benchdnn_modeC_graph_ci_cpu segfault on AArch64 with checks that return a oneDNN status_t value. It also addresses the segfault of the --graph --skip-impl=ref --case=op/f32/deconv.json test, which is part of the graph test batch, when built with gcc-10 and g++-10.
Checklist
General
Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
Bug fixes