Maybe I'm missing something, but why would we need libraries or roll our own? I assume a language (e.g., HIP) would support reduced-precision instructions natively or through libraries like rocBLAS. So what is the extra step needed for?
-
This would be part of the gem5 build, so we wouldn't want users in the gem5 community to need ROCm installed locally*. Basically, I am looking for something that would do the math in the instructions.cc file. Using HIP's __half2, etc., is another potential option, but it would require #include'ing the hip_fp16.h header from HIP, which would need a ROCm install.
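To make "do the math in the instructions.cc file" concrete, here is a minimal sketch of decoding an IEEE 754 binary16 value into a float using only the C++ standard library, so no ROCm headers are needed. The function name `fp16ToFloat` is hypothetical and not an existing gem5 or HIP API, and the sketch does not preserve NaN payloads.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of gem5 or HIP): decode the raw bits of an
// IEEE 754 binary16 value (1 sign, 5 exponent, 10 fraction bits) to float.
inline float
fp16ToFloat(std::uint16_t h)
{
    const std::uint32_t sign = (h >> 15) & 0x1;
    const std::uint32_t exp = (h >> 10) & 0x1F;
    const std::uint32_t frac = h & 0x3FF;

    float mag;
    if (exp == 0) {
        // Zero or subnormal: frac * 2^-24.
        mag = std::ldexp(static_cast<float>(frac), -24);
    } else if (exp == 0x1F) {
        // Infinity or NaN (payload is dropped in this sketch).
        mag = frac ? std::numeric_limits<float>::quiet_NaN()
                   : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + frac) * 2^(exp - 25) == (1 + frac/1024) * 2^(exp - 15).
        mag = std::ldexp(static_cast<float>(frac | 0x400),
                         static_cast<int>(exp) - 25);
    }
    return sign ? -mag : mag;
}
```

An instruction implementation could then decode its operands this way, compute in float, and convert the result back. Whether that round trip is bit-accurate for every operation is exactly the kind of question a dedicated library would settle.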
-
There is already a small library that implements bit-accurate Arm FP support in
-
For the next-next-release (i.e., v24.0) I am attempting to bump the GPU ISA to support the new instructions up to and including MI200, and maybe MI300 if the spec comes out in the next few weeks. As part of that, there are some float16/bfloat16 instructions that would need software support in gem5 to execute, since these are not native C++ types. (See section 12.10 here: https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/instinct-mi200-cdna2-instruction-set-architecture.pdf)
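As a rough illustration of what the "roll our own" route involves for bfloat16, the sketch below converts between float and bfloat16 bit patterns. bfloat16 is the upper 16 bits of an IEEE 754 binary32 value, so the only real work is rounding. The names `floatToBf16` and `bf16ToFloat` are hypothetical, and round-to-nearest-even is an assumption here; the rounding mode an actual instruction uses would have to come from the ISA document.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not existing gem5 code.

// Convert float to bfloat16 bits, assuming round-to-nearest-even.
inline std::uint16_t
floatToBf16(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    if (std::isnan(f)) {
        // Set a fraction bit so truncation cannot turn a NaN into infinity.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040);
    }
    // Add 0x7FFF plus the lowest kept bit: ties round to the even value.
    const std::uint32_t lsb = (bits >> 16) & 1;
    bits += 0x7FFF + lsb;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Convert bfloat16 bits to float; this direction is exact.
inline float
bf16ToFloat(std::uint16_t h)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Arithmetic could then be done in float and rounded back, but that shortcut does not automatically match hardware results for fused or packed operations, which is part of the motivation for considering a library.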
There are a few options here so I wanted to start a discussion on what others think before I get too much further. This could potentially be beneficial outside the GPU, such as x86 AVX512-xyz instructions and I'm sure ARM / RISC-V folks have or are considering instructions which support reduced precision FP types.
At a high level, the options look like this:
Another question would be where to put a 3rd party library if we go that route. Traditionally we place these kinds of things in ext/, or, like the recent Capstone contribution, we check for an install in scons (KConfig?) and optionally build it in. However, some options are C++ header-only libraries, which would have to be compiled in.
Lastly, I don't want to limit this discussion to only fp16/bf16. Since gem5 is a simulator we could enable exploration of custom data types and new instructions that don't yet exist. For example, the recent Open Compute Project Microscaling Formats could be a good baseline: (https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). Some 3rd party libraries already support this.
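For a sense of scale, decoding one Microscaling block is small. The sketch below assumes the MXFP4 layout as I read the MX v1.0 spec: a block of 32 FP4 (E2M1) elements sharing one 8-bit E8M0 power-of-two scale with bias 127, where 0xFF encodes NaN. All names are hypothetical, and each element is stored one per byte in the low nibble for simplicity rather than packed as in memory.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical names; block size and encodings are assumptions taken from
// the OCP MX v1.0 specification.
constexpr std::size_t MxBlockSize = 32;

// E8M0 shared scale: 2^(x - 127), with 0xFF reserved for NaN.
inline float
e8m0ToFloat(std::uint8_t x)
{
    if (x == 0xFF)
        return std::numeric_limits<float>::quiet_NaN();
    return std::ldexp(1.0f, static_cast<int>(x) - 127);
}

// FP4 E2M1 element: 1 sign, 2 exponent, 1 fraction bit; no Inf or NaN.
inline float
e2m1ToFloat(std::uint8_t nibble)
{
    static constexpr float mags[8] =
        {0.0f, 0.5f, 1.0f, 1.5f, 2.0f, 3.0f, 4.0f, 6.0f};
    const float mag = mags[nibble & 0x7];
    return (nibble & 0x8) ? -mag : mag;
}

// Decode a block: each value is the shared scale times the element.
inline std::array<float, MxBlockSize>
decodeMxfp4Block(std::uint8_t scale,
                 const std::array<std::uint8_t, MxBlockSize> &elems)
{
    std::array<float, MxBlockSize> out{};
    const float s = e8m0ToFloat(scale);
    for (std::size_t i = 0; i < MxBlockSize; ++i)
        out[i] = s * e2m1ToFloat(elems[i]);
    return out;
}
```

A NaN scale makes every value in the block NaN here because the multiplication propagates it, which matches my reading of the spec.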
I am not planning to commit to anything immediately, but wanted to start collecting thoughts before doing another code dump :).