[RISCV] Support inline assembly 'f' constraint for Zfinx. #112986

Merged: 1 commit merged into llvm:main on Oct 19, 2024


topperc (Collaborator) commented Oct 18, 2024

This would allow some inline assembly code to work with either F or Zfinx. This appears to match GCC's behavior.

This will need to be adjusted to exclude X0 after #112563.
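
For readers skimming the diff, the selection logic for the 'f' constraint can be sketched as a small standalone model. This is only an illustration: `Features`, `FPType`, and `regClassForF` are hypothetical names that do not exist in LLVM, the register classes are represented as plain strings, and the sketch mirrors the diff as posted here, so it does not reflect any later adjustment for X0.

```cpp
#include <string>

// Hypothetical stand-in for the RISCVSubtarget feature queries used in the
// patch (hasStdExtF(), hasStdExtZfinx(), is64Bit(), and so on).
struct Features {
  bool F = false, D = false, Zfhmin = false;
  bool Zfinx = false, Zdinx = false, Zhinxmin = false;
  bool Is64Bit = false;
};

// Hypothetical stand-in for MVT::f16 / MVT::f32 / MVT::f64.
enum class FPType { F16, F32, F64 };

// Returns the name of the register class the 'f' constraint would select,
// or an empty string when no class applies (the patch falls through to
// `break` in that case). FPR classes are tried first, so a subtarget with
// the regular F/D/Zfhmin extensions keeps its previous behavior.
std::string regClassForF(const Features &ST, FPType VT) {
  switch (VT) {
  case FPType::F16:
    if (ST.Zfhmin)
      return "FPR16";
    if (ST.Zhinxmin)
      return "GPRF16";
    break;
  case FPType::F32:
    if (ST.F)
      return "FPR32";
    if (ST.Zfinx)
      return "GPRF32";
    break;
  case FPType::F64:
    if (ST.D)
      return "FPR64";
    // With Zdinx, a double needs a GPR pair on RV32 and one GPR on RV64.
    if (ST.Zdinx)
      return ST.Is64Bit ? "GPRNoX0" : "GPRPair";
    break;
  }
  return "";
}
```

In this model, a Zfinx-only subtarget maps an f32 operand to `GPRF32`, while a subtarget with F maps it to `FPR32`, which is why the same inline assembly string can serve both configurations. The 'cf' constraint follows the same shape with the compressed (`C`) register classes.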

llvmbot (Collaborator) commented Oct 18, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)


Full diff: https://github.com/llvm/llvm-project/pull/112986.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+36-12)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll (+48)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll (+45)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll (+82)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 60ac58f824ede4..63fc3dbf4fa6ae 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20392,12 +20392,24 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
         return std::make_pair(0U, &RISCV::GPRPairRegClass);
       return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
     case 'f':
-      if (Subtarget.hasStdExtZfhmin() && VT == MVT::f16)
-        return std::make_pair(0U, &RISCV::FPR16RegClass);
-      if (Subtarget.hasStdExtF() && VT == MVT::f32)
-        return std::make_pair(0U, &RISCV::FPR32RegClass);
-      if (Subtarget.hasStdExtD() && VT == MVT::f64)
-        return std::make_pair(0U, &RISCV::FPR64RegClass);
+      if (VT == MVT::f16) {
+        if (Subtarget.hasStdExtZfhmin())
+          return std::make_pair(0U, &RISCV::FPR16RegClass);
+        if (Subtarget.hasStdExtZhinxmin())
+          return std::make_pair(0U, &RISCV::GPRF16RegClass);
+      } else if (VT == MVT::f32) {
+        if (Subtarget.hasStdExtF())
+          return std::make_pair(0U, &RISCV::FPR32RegClass);
+        if (Subtarget.hasStdExtZfinx())
+          return std::make_pair(0U, &RISCV::GPRF32RegClass);
+      } else if (VT == MVT::f64) {
+        if (Subtarget.hasStdExtD())
+          return std::make_pair(0U, &RISCV::FPR64RegClass);
+        if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
+          return std::make_pair(0U, &RISCV::GPRPairRegClass);
+        if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
+          return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
+      }
       break;
     default:
       break;
@@ -20440,12 +20452,24 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
     if (!VT.isVector())
       return std::make_pair(0U, &RISCV::GPRCRegClass);
   } else if (Constraint == "cf") {
-    if (Subtarget.hasStdExtZfhmin() && VT == MVT::f16)
-      return std::make_pair(0U, &RISCV::FPR16CRegClass);
-    if (Subtarget.hasStdExtF() && VT == MVT::f32)
-      return std::make_pair(0U, &RISCV::FPR32CRegClass);
-    if (Subtarget.hasStdExtD() && VT == MVT::f64)
-      return std::make_pair(0U, &RISCV::FPR64CRegClass);
+    if (VT == MVT::f16) {
+      if (Subtarget.hasStdExtZfhmin())
+        return std::make_pair(0U, &RISCV::FPR16CRegClass);
+      if (Subtarget.hasStdExtZhinxmin())
+        return std::make_pair(0U, &RISCV::GPRF16CRegClass);
+    } else if (VT == MVT::f32) {
+      if (Subtarget.hasStdExtF())
+        return std::make_pair(0U, &RISCV::FPR32CRegClass);
+      if (Subtarget.hasStdExtZfinx())
+        return std::make_pair(0U, &RISCV::GPRF32CRegClass);
+    } else if (VT == MVT::f64) {
+      if (Subtarget.hasStdExtD())
+        return std::make_pair(0U, &RISCV::FPR64CRegClass);
+      if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
+        return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+      if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
+        return std::make_pair(0U, &RISCV::GPRCRegClass);
+    }
   }
 
   // Clang will correctly decode the usage of register name aliases into their
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll b/llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll
index 15729ee2bc61e9..57be0e5e4199ac 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll
@@ -90,3 +90,51 @@ define double @constraint_double_abi_name(double %a) nounwind {
   %2 = tail call double asm "fadd.d $0, $1, $2", "={t1},{a0},{s0}"(double %a, double %1)
   ret double %2
 }
+
+define double @constraint_f_double(double %a) nounwind {
+; RV32FINX-LABEL: constraint_f_double:
+; RV32FINX:       # %bb.0:
+; RV32FINX-NEXT:    lui a2, %hi(gd)
+; RV32FINX-NEXT:    lw a3, %lo(gd+4)(a2)
+; RV32FINX-NEXT:    lw a2, %lo(gd)(a2)
+; RV32FINX-NEXT:    #APP
+; RV32FINX-NEXT:    fadd.d a0, a0, a2
+; RV32FINX-NEXT:    #NO_APP
+; RV32FINX-NEXT:    ret
+;
+; RV64FINX-LABEL: constraint_f_double:
+; RV64FINX:       # %bb.0:
+; RV64FINX-NEXT:    lui a1, %hi(gd)
+; RV64FINX-NEXT:    ld a1, %lo(gd)(a1)
+; RV64FINX-NEXT:    #APP
+; RV64FINX-NEXT:    fadd.d a0, a0, a1
+; RV64FINX-NEXT:    #NO_APP
+; RV64FINX-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm "fadd.d $0, $1, $2", "=f,f,f"(double %a, double %1)
+  ret double %2
+}
+
+define double @constraint_cf_double(double %a) nounwind {
+; RV32FINX-LABEL: constraint_cf_double:
+; RV32FINX:       # %bb.0:
+; RV32FINX-NEXT:    lui a2, %hi(gd)
+; RV32FINX-NEXT:    lw a3, %lo(gd+4)(a2)
+; RV32FINX-NEXT:    lw a2, %lo(gd)(a2)
+; RV32FINX-NEXT:    #APP
+; RV32FINX-NEXT:    fadd.d a0, a0, a2
+; RV32FINX-NEXT:    #NO_APP
+; RV32FINX-NEXT:    ret
+;
+; RV64FINX-LABEL: constraint_cf_double:
+; RV64FINX:       # %bb.0:
+; RV64FINX-NEXT:    lui a1, %hi(gd)
+; RV64FINX-NEXT:    ld a1, %lo(gd)(a1)
+; RV64FINX-NEXT:    #APP
+; RV64FINX-NEXT:    fadd.d a0, a0, a1
+; RV64FINX-NEXT:    #NO_APP
+; RV64FINX-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm "fadd.d $0, $1, $2", "=^cf,^cf,^cf"(double %a, double %1)
+  ret double %2
+}
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll b/llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll
index a8d3515fe1890e..1c0de6c3f16121 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll
@@ -87,3 +87,48 @@ define float @constraint_float_abi_name(float %a) nounwind {
   ret float %2
 }
 
+define float @constraint_f_float(float %a) nounwind {
+; RV32FINX-LABEL: constraint_f_float:
+; RV32FINX:       # %bb.0:
+; RV32FINX-NEXT:    lui a1, %hi(gf)
+; RV32FINX-NEXT:    lw a1, %lo(gf)(a1)
+; RV32FINX-NEXT:    #APP
+; RV32FINX-NEXT:    fadd.s a0, a0, a1
+; RV32FINX-NEXT:    #NO_APP
+; RV32FINX-NEXT:    ret
+;
+; RV64FINX-LABEL: constraint_f_float:
+; RV64FINX:       # %bb.0:
+; RV64FINX-NEXT:    lui a1, %hi(gf)
+; RV64FINX-NEXT:    lw a1, %lo(gf)(a1)
+; RV64FINX-NEXT:    #APP
+; RV64FINX-NEXT:    fadd.s a0, a0, a1
+; RV64FINX-NEXT:    #NO_APP
+; RV64FINX-NEXT:    ret
+  %1 = load float, ptr @gf
+  %2 = tail call float asm "fadd.s $0, $1, $2", "=f,f,f"(float %a, float %1)
+  ret float %2
+}
+
+define float @constraint_cf_float(float %a) nounwind {
+; RV32FINX-LABEL: constraint_cf_float:
+; RV32FINX:       # %bb.0:
+; RV32FINX-NEXT:    lui a1, %hi(gf)
+; RV32FINX-NEXT:    lw a1, %lo(gf)(a1)
+; RV32FINX-NEXT:    #APP
+; RV32FINX-NEXT:    fadd.s a0, a0, a1
+; RV32FINX-NEXT:    #NO_APP
+; RV32FINX-NEXT:    ret
+;
+; RV64FINX-LABEL: constraint_cf_float:
+; RV64FINX:       # %bb.0:
+; RV64FINX-NEXT:    lui a1, %hi(gf)
+; RV64FINX-NEXT:    lw a1, %lo(gf)(a1)
+; RV64FINX-NEXT:    #APP
+; RV64FINX-NEXT:    fadd.s a0, a0, a1
+; RV64FINX-NEXT:    #NO_APP
+; RV64FINX-NEXT:    ret
+  %1 = load float, ptr @gf
+  %2 = tail call float asm "fadd.s $0, $1, $2", "=^cf,cf,cf"(float %a, float %1)
+  ret float %2
+}
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll b/llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll
index f9707c6c8995dc..086d2a1d6f3b2f 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll
@@ -156,3 +156,85 @@ define half @constraint_half_abi_name(half %a) nounwind {
   %2 = tail call half asm "fadd.s $0, $1, $2", "={t0},{a0},{s0}"(half %a, half %1)
   ret half %2
 }
+
+define half @constraint_f_half(half %a) nounwind {
+; RV32ZHINX-LABEL: constraint_f_half:
+; RV32ZHINX:       # %bb.0:
+; RV32ZHINX-NEXT:    lui a1, %hi(gh)
+; RV32ZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV32ZHINX-NEXT:    #APP
+; RV32ZHINX-NEXT:    fadd.h a0, a0, a1
+; RV32ZHINX-NEXT:    #NO_APP
+; RV32ZHINX-NEXT:    ret
+;
+; RV64ZHINX-LABEL: constraint_f_half:
+; RV64ZHINX:       # %bb.0:
+; RV64ZHINX-NEXT:    lui a1, %hi(gh)
+; RV64ZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV64ZHINX-NEXT:    #APP
+; RV64ZHINX-NEXT:    fadd.h a0, a0, a1
+; RV64ZHINX-NEXT:    #NO_APP
+; RV64ZHINX-NEXT:    ret
+;
+; RV32DINXZHINX-LABEL: constraint_f_half:
+; RV32DINXZHINX:       # %bb.0:
+; RV32DINXZHINX-NEXT:    lui a1, %hi(gh)
+; RV32DINXZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV32DINXZHINX-NEXT:    #APP
+; RV32DINXZHINX-NEXT:    fadd.h a0, a0, a1
+; RV32DINXZHINX-NEXT:    #NO_APP
+; RV32DINXZHINX-NEXT:    ret
+;
+; RV64DINXZHINX-LABEL: constraint_f_half:
+; RV64DINXZHINX:       # %bb.0:
+; RV64DINXZHINX-NEXT:    lui a1, %hi(gh)
+; RV64DINXZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV64DINXZHINX-NEXT:    #APP
+; RV64DINXZHINX-NEXT:    fadd.h a0, a0, a1
+; RV64DINXZHINX-NEXT:    #NO_APP
+; RV64DINXZHINX-NEXT:    ret
+  %1 = load half, ptr @gh
+  %2 = tail call half asm "fadd.h $0, $1, $2", "=f,f,f"(half %a, half %1)
+  ret half %2
+}
+
+define half @constraint_cf_half(half %a) nounwind {
+; RV32ZHINX-LABEL: constraint_cf_half:
+; RV32ZHINX:       # %bb.0:
+; RV32ZHINX-NEXT:    lui a1, %hi(gh)
+; RV32ZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV32ZHINX-NEXT:    #APP
+; RV32ZHINX-NEXT:    fadd.h a0, a0, a1
+; RV32ZHINX-NEXT:    #NO_APP
+; RV32ZHINX-NEXT:    ret
+;
+; RV64ZHINX-LABEL: constraint_cf_half:
+; RV64ZHINX:       # %bb.0:
+; RV64ZHINX-NEXT:    lui a1, %hi(gh)
+; RV64ZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV64ZHINX-NEXT:    #APP
+; RV64ZHINX-NEXT:    fadd.h a0, a0, a1
+; RV64ZHINX-NEXT:    #NO_APP
+; RV64ZHINX-NEXT:    ret
+;
+; RV32DINXZHINX-LABEL: constraint_cf_half:
+; RV32DINXZHINX:       # %bb.0:
+; RV32DINXZHINX-NEXT:    lui a1, %hi(gh)
+; RV32DINXZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV32DINXZHINX-NEXT:    #APP
+; RV32DINXZHINX-NEXT:    fadd.h a0, a0, a1
+; RV32DINXZHINX-NEXT:    #NO_APP
+; RV32DINXZHINX-NEXT:    ret
+;
+; RV64DINXZHINX-LABEL: constraint_cf_half:
+; RV64DINXZHINX:       # %bb.0:
+; RV64DINXZHINX-NEXT:    lui a1, %hi(gh)
+; RV64DINXZHINX-NEXT:    lh a1, %lo(gh)(a1)
+; RV64DINXZHINX-NEXT:    #APP
+; RV64DINXZHINX-NEXT:    fadd.h a0, a0, a1
+; RV64DINXZHINX-NEXT:    #NO_APP
+; RV64DINXZHINX-NEXT:    ret
+  %1 = load half, ptr @gh
+  %2 = tail call half asm "fadd.h $0, $1, $2", "=^cf,^cf,^cf"(half %a, half %1)
+  ret half %2
+}

lenary (Member) commented Oct 18, 2024

I've just merged that other PR, so you should be able to do a rebase now.

@topperc topperc merged commit 1bc1a79 into llvm:main Oct 19, 2024
6 of 8 checks passed
@topperc topperc deleted the pr/f-constraint-zfinx branch October 19, 2024 01:17
llvm-ci (Collaborator) commented Oct 19, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/7296

Here is the relevant piece of the build log for reference:
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp   -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test/ompt /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test/ompt /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************

Step 10 (Add check check-offload) failure: 1200 seconds without output running [b'ninja', b'-j 32', b'check-offload'], attempting to kill
...
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug47654.cpp (866 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49779.cpp (867 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/test_libc.cpp (868 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug50022.cpp (869 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/wtime.c (870 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp (871 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/std_complex_arithmetic.cpp (872 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/complex_reduction.cpp (873 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/std_complex_arithmetic.cpp (874 of 879)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49021.cpp (875 of 879)
command timed out: 1200 seconds without output running [b'ninja', b'-j 32', b'check-offload'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1237.763661
