Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A few more MMAIntrinsics #19099

Merged
merged 1 commit into from
Nov 12, 2024
Merged

A few more MMAIntrinsics #19099

merged 1 commit into from
Nov 12, 2024

Conversation

bjacob
Copy link
Contributor

@bjacob bjacob commented Nov 11, 2024

Supporting some MMAIntrinsics that were omitted for no particular reason, and that seemed like they could be useful. In particular, supporting the f64 intrinsic present in CDNA2/3, adding mixed-f8-types support in CDNA3, and adding bf16 support to CDNA2 and RDNA3.

@kuhar
Copy link
Member

kuhar commented Nov 11, 2024

@bjacob can you rebase it on top of 48f6dee and see if anything needs updating? I think it shouldn't but it'd be worth checking before landing this.

@bjacob bjacob force-pushed the users/bjacob/refactor-mma-intrinsics branch from a44474a to 17dc529 Compare November 11, 2024 21:57
@bjacob bjacob force-pushed the users/bjacob/more-mma-intrinsics branch from ffc4d58 to 09d91a5 Compare November 11, 2024 21:57
@bjacob bjacob requested a review from kuhar November 11, 2024 21:59
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I assume the thumbs-up means that you rebased it and it still works)

@bjacob
Copy link
Contributor Author

bjacob commented Nov 11, 2024

(I assume the thumbs-up means that you rebased it and it still works)

Yes :-)

bjacob added a commit that referenced this pull request Nov 12, 2024
This adds some e2e matmul tests on CPU for f64 (double precision).
Motivation:
- This is technically supported by IREE.
- This was already > 50% implemented in the e2e matmul testing
framework, and the partly implemented status was not a good steady state
to be in.
- My real motivation is #19099, I'm
doing a round of AMD MFMA / data tiling improvements, thought we should
support a denser set of MFMA intrinsics, noticed this has been supported
since CDNA2 so we shouldn't be prevented from supporting it just because
our e2e matmul tests can't cover it.

Signed-off-by: Benoit Jacob <[email protected]>
bjacob added a commit that referenced this pull request Nov 12, 2024
Some code simplifications in `IREEGPUAttrs.cpp`, and some shuffling of
the MMAIntrinsic enum, making it easier to add more MMAIntrinsics in the
future (see #19099).

---------

Signed-off-by: Benoit Jacob <[email protected]>
Base automatically changed from users/bjacob/refactor-mma-intrinsics to main November 12, 2024 16:20
Signed-off-by: Benoit Jacob <[email protected]>
@bjacob bjacob force-pushed the users/bjacob/more-mma-intrinsics branch from 09d91a5 to ab2aa4c Compare November 12, 2024 16:23
@bjacob bjacob enabled auto-merge (squash) November 12, 2024 16:26
@bjacob bjacob merged commit 31e7343 into main Nov 12, 2024
39 checks passed
@bjacob bjacob deleted the users/bjacob/more-mma-intrinsics branch November 12, 2024 16:54
@bjacob bjacob mentioned this pull request Nov 12, 2024
Comment on lines +277 to +282
case MMAIntrinsic::WMMA_F32_16x16x16_BF16: {
return {bf16, bf16, f32};
}
case MMAIntrinsic::WMMA_BF16_16x16x16_BF16: {
return {bf16, bf16, bf16};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gcc build error somewhere in this function: https://github.com/iree-org/iree/actions/runs/11814370974/job/32913293915#step:4:7077

[6444/8163] Building CXX object compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/CMakeFiles/iree_compiler_Codegen_Dialect_GPU_IR_IREEGPUDialect.objects.dir/IREEGPUAttrs.cpp.o
FAILED: compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/CMakeFiles/iree_compiler_Codegen_Dialect_GPU_IR_IREEGPUDialect.objects.dir/IREEGPUAttrs.cpp.o 
/usr/local/bin/ccache /usr/bin/g++-9  -I/__w/iree/iree -I/__w/iree/iree/build-gcc -I/__w/iree/iree/third_party/llvm-project/llvm/include -I/__w/iree/iree/build-gcc/llvm-project/include -I/__w/iree/iree/third_party/llvm-project/mlir/include -I/__w/iree/iree/build-gcc/llvm-project/tools/mlir/include -I/__w/iree/iree/third_party/llvm-project/lld/include -I/__w/iree/iree/build-gcc/llvm-project/tools/lld/include -I/__w/iree/iree/compiler/src -I/__w/iree/iree/build-gcc/compiler/src -I/__w/iree/iree/runtime/src -I/__w/iree/iree/build-gcc/runtime/src -I/__w/iree/iree/build-gcc/runtime/src/iree/base/internal/flatcc -I/__w/iree/iree/compiler/bindings/c/. -I/__w/iree/iree/build-gcc/compiler/bindings/c/. -isystem /__w/iree/iree/third_party/flatcc/include -O3  -fPIC -fvisibility=hidden -fno-rtti -fno-exceptions -Wall -Werror -Wno-error=deprecated-declarations -Wno-address -Wno-address-of-packed-member -Wno-comment -Wno-format-zero-length -Wno-uninitialized -Wno-overloaded-virtual -Wno-invalid-offsetof -Wno-sign-compare -Wno-unused-function -Wno-unknown-pragmas -Wno-unused-but-set-variable -Wno-misleading-indentation -fmacro-prefix-map=/__w/iree/iree=iree -I/__w/iree/iree/third_party/flatcc/include/ -I/__w/iree/iree/third_party/flatcc/include/flatcc/reflection/ -std=gnu++17 -MD -MT compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/CMakeFiles/iree_compiler_Codegen_Dialect_GPU_IR_IREEGPUDialect.objects.dir/IREEGPUAttrs.cpp.o -MF compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/CMakeFiles/iree_compiler_Codegen_Dialect_GPU_IR_IREEGPUDialect.objects.dir/IREEGPUAttrs.cpp.o.d -o compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/CMakeFiles/iree_compiler_Codegen_Dialect_GPU_IR_IREEGPUDialect.objects.dir/IREEGPUAttrs.cpp.o -c /__w/iree/iree/compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/IREEGPUAttrs.cpp
/__w/iree/iree/compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/IREEGPUAttrs.cpp: In function 'std::tuple<mlir::Type, mlir::Type, mlir::Type> mlir::iree_compiler::IREE::GPU::getABCElementTypes(mlir::MLIRContext*, mlir::iree_compiler::IREE::GPU::MMAIntrinsic)':
/__w/iree/iree/compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/IREEGPUAttrs.cpp:293:1: error: control reaches end of non-void function [-Werror=return-type]
  293 | }
      | ^
cc1plus: all warnings being treated as errors

unhandled switch case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting! I'll fix asap (just bad combination right now of: I'm WFH, the server i've been dev on is down, and my workstation in the office is turned off).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is not a missing switch case (Clang would complain in that case) but GCC rightfully pointing out that even though we have handled all switch cases, it is still unsafe to assume that one of them is taken. See llvm/llvm-project#115345.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants