Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV][ISel] Ensure 'in X' Constraints prevent X0 #112563

Merged
merged 3 commits into from
Oct 18, 2024

Conversation

lenary
Copy link
Member

@lenary lenary commented Oct 16, 2024

I'm not sure if this fix is required, but I've written the patch anyway. This does not cause test changes, but we haven't got tests that try to use all 32 registers in inline assembly.

Broadly, for GPRs, we made the explicit choice that r constraints would never attempt to use x0, because x0 isn't really usable like the other GPRs. I believe the same thing applies to Zhinx, Zfinx and Zdinx because they should not be allocating operands to x0 either, so this patch introduces new NoX0 classes for GPRF16 and GPRF32 registers, and uses them with inline assembly. There is also a GPRPairNoX0 for the Zdinx case on rv32, avoiding use of the x0 pair which has different behaviour to the other GPR pairs.


This is stacked on #112561, only review the last commit in this PR.

@lenary lenary requested a review from topperc October 16, 2024 15:09
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-risc-v

Author: Sam Elliott (lenary)

Changes

I'm not sure if this fix is required, but I've written the patch anyway. This does not cause test changes, but we haven't got tests that try to use all 32 registers in inline assembly.

Broadly, for GPRs, we made the explicit choice that r constraints would never attempt to use x0, because x0 isn't really usable like the other GPRs. I believe the same thing applies to Zhinx, Zfinx and Zdinx because they should not be allocating operands to x0 either, so this patch introduces new NoX0 classes for GPRF16 and GPRF32 registers, and uses them with inline assembly. There is also a GPRPairNoX0 for the Zdinx case on rv32, avoiding use of the x0 pair which has different behaviour to the other GPR pairs.


This is stacked on #112561, only review the last commit in this PR.


Patch is 48.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112563.diff

17 Files Affected:

  • (modified) clang/lib/Basic/Targets/RISCV.cpp (+10)
  • (modified) clang/test/CodeGen/RISCV/riscv-inline-asm.c (+39-1)
  • (modified) llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp (+9)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+21-3)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.td (+19-3)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll (+33)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-d-modifier-N.ll (+109)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-f-constraint-f.ll (+27-1)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-f-modifier-N.ll (+96)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-invalid.ll (+20)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-zdinx-constraint-r.ll (+92)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm-zfh-constraint-f.ll (+41)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-zfh-modifier-N.ll (+157)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-zfinx-constraint-r.ll (+89)
  • (added) llvm/test/CodeGen/RISCV/inline-asm-zhinx-constraint-r.ll (+158)
  • (modified) llvm/test/CodeGen/RISCV/inline-asm.ll (+66)
  • (modified) llvm/test/CodeGen/RISCV/zdinx-asm-constraint.ll (+61)
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 870f0f38bc3057..eaaba7642bd7b2 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -100,6 +100,14 @@ bool RISCVTargetInfo::validateAsmConstraint(
   case 'S': // A symbol or label reference with a constant offset
     Info.setAllowsRegister();
     return true;
+  case 'c':
+    // A RVC register - GPR or FPR
+    if (Name[1] == 'r' || Name[1] == 'f') {
+      Info.setAllowsRegister();
+      Name += 1;
+      return true;
+    }
+    return false;
   case 'v':
     // A vector register.
     if (Name[1] == 'r' || Name[1] == 'd' || Name[1] == 'm') {
@@ -114,6 +122,8 @@ bool RISCVTargetInfo::validateAsmConstraint(
 std::string RISCVTargetInfo::convertConstraint(const char *&Constraint) const {
   std::string R;
   switch (*Constraint) {
+  // c* and v* are two-letter constraints on RISC-V.
+  case 'c':
   case 'v':
     R = std::string("^") + std::string(Constraint, 2);
     Constraint += 1;
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm.c b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
index fa0bf6aa6aa471..75b91d3c497c50 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
@@ -3,7 +3,35 @@
 // RUN: %clang_cc1 -triple riscv64 -O2 -emit-llvm %s -o - \
 // RUN:     | FileCheck %s
 
-// Test RISC-V specific inline assembly constraints.
+// Test RISC-V specific inline assembly constraints and modifiers.
+
+long test_r(long x) {
+// CHECK-LABEL: define{{.*}} {{i64|i32}} @test_r(
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+  long ret;
+  asm volatile ("" : "=r"(ret) : "r"(x));
+// CHECK: call {{i64|i32}} asm sideeffect "", "=r,r"({{i64|i32}} %{{.*}})
+  asm volatile ("" : "=r"(ret) : "r"(x));
+  return ret;
+}
+
+long test_cr(long x) {
+// CHECK-LABEL: define{{.*}} {{i64|i32}} @test_cr(
+// CHECK: call {{i64|i32}} asm sideeffect "", "=^cr,^cr"({{i64|i32}} %{{.*}})
+  long ret;
+  asm volatile ("" : "=cr"(ret) : "cr"(x));
+  return ret;
+}
+
+float cf;
+double cd;
+void test_cf(float f, double d) {
+// CHECK-LABEL: define{{.*}} void @test_cf(
+// CHECK: call float asm sideeffect "", "=^cf,^cf"(float %{{.*}})
+  asm volatile("" : "=cf"(cf) : "cf"(f));
+// CHECK: call double asm sideeffect "", "=^cf,^cf"(double %{{.*}})
+  asm volatile("" : "=cf"(cd) : "cf"(d));
+}
 
 void test_I(void) {
 // CHECK-LABEL: define{{.*}} void @test_I()
@@ -58,3 +86,13 @@ void test_s(void) {
 
   asm("// %0 %1 %2" :: "S"(&var), "S"(&arr[1][1]), "S"(test_s));
 }
+
+// CHECK-LABEL: test_modifiers(
+// CHECK: call void asm sideeffect "// ${0:i} ${1:i}", "r,r"({{i32|i64}} %val, i32 37)
+// CHECK: call void asm sideeffect "// ${0:z} ${1:z}", "i,i"(i32 0, i32 1)
+// CHECK: call void asm sideeffect "// ${0:N}", "r"({{i32|i64}} %val)
+void test_modifiers(long val) {
+  asm volatile("// %i0 %i1" :: "r"(val), "r"(37));
+  asm volatile("// %z0 %z1" :: "i"(0), "i"(1));
+  asm volatile("// %N0" :: "r"(val));
+}
diff --git a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
index 5ad09ae7290fc5..adca0cccc5e484 100644
--- a/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+++ b/llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
@@ -19,6 +19,7 @@
 #include "RISCV.h"
 #include "RISCVConstantPoolValue.h"
 #include "RISCVMachineFunctionInfo.h"
+#include "RISCVRegisterInfo.h"
 #include "RISCVTargetMachine.h"
 #include "TargetInfo/RISCVTargetInfo.h"
 #include "llvm/ADT/APInt.h"
@@ -348,6 +349,14 @@ bool RISCVAsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNo,
       if (!MO.isReg())
         OS << 'i';
       return false;
+    case 'N': // Print the register encoding as an integer (0-31, or 0-7 when
+              // the constraint was 'c*')
+      if (!MO.isReg())
+        return true;
+
+      const RISCVRegisterInfo *TRI = STI->getRegisterInfo();
+      OS << TRI->getEncodingValue(MO.getReg());
+      return false;
     }
   }
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index cde690793f0702..ad777745fda313 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20229,6 +20229,8 @@ RISCVTargetLowering::getConstraintType(StringRef Constraint) const {
   } else {
     if (Constraint == "vr" || Constraint == "vd" || Constraint == "vm")
       return C_RegisterClass;
+    if (Constraint == "cr" || Constraint == "cf")
+      return C_RegisterClass;
   }
   return TargetLowering::getConstraintType(Constraint);
 }
@@ -20246,11 +20248,11 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
       if (VT.isVector())
         break;
       if (VT == MVT::f16 && Subtarget.hasStdExtZhinxmin())
-        return std::make_pair(0U, &RISCV::GPRF16RegClass);
+        return std::make_pair(0U, &RISCV::GPRF16NoX0RegClass);
       if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
-        return std::make_pair(0U, &RISCV::GPRF32RegClass);
+        return std::make_pair(0U, &RISCV::GPRF32NoX0RegClass);
       if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
-        return std::make_pair(0U, &RISCV::GPRPairRegClass);
+        return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
       return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
     case 'f':
       if (Subtarget.hasStdExtZfhmin() && VT == MVT::f16)
@@ -20291,6 +20293,22 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
   } else if (Constraint == "vm") {
     if (TRI->isTypeLegalForClass(RISCV::VMV0RegClass, VT.SimpleTy))
       return std::make_pair(0U, &RISCV::VMV0RegClass);
+  } else if (Constraint == "cr") {
+    if (VT == MVT::f16 && Subtarget.hasStdExtZhinxmin())
+      return std::make_pair(0U, &RISCV::GPRF16CRegClass);
+    if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
+      return std::make_pair(0U, &RISCV::GPRF32CRegClass);
+    if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
+      return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+    if (!VT.isVector())
+      return std::make_pair(0U, &RISCV::GPRCRegClass);
+  } else if (Constraint == "cf") {
+    if (Subtarget.hasStdExtZfhmin() && VT == MVT::f16)
+      return std::make_pair(0U, &RISCV::FPR16CRegClass);
+    if (Subtarget.hasStdExtF() && VT == MVT::f32)
+      return std::make_pair(0U, &RISCV::FPR32CRegClass);
+    if (Subtarget.hasStdExtD() && VT == MVT::f64)
+      return std::make_pair(0U, &RISCV::FPR64CRegClass);
   }
 
   // Clang will correctly decode the usage of register name aliases into their
diff --git a/llvm/lib/Target/RISCV/RISCVRegisterInfo.td b/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
index 33363aa8b71830..685f04213afa86 100644
--- a/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
@@ -338,6 +338,11 @@ def FPR16 : RISCVRegisterClass<[f16, bf16], 16, (add
     (sequence "F%u_H", 18, 27)  // fs2-fs11
 )>;
 
+def FPR16C : RISCVRegisterClass<[f16, bf16], 16, (add
+    (sequence "F%u_H", 15, 10),
+    (sequence "F%u_H", 8, 9)
+)>;
+
 def FPR32 : RISCVRegisterClass<[f32], 32, (add
     (sequence "F%u_F", 15, 10),
     (sequence "F%u_F", 0, 7),
@@ -656,6 +661,7 @@ def GPRF16 : RISCVRegisterClass<[f16], 16, (add (sequence "X%u_H", 10, 17),
                                                 (sequence "X%u_H", 0, 4))>;
 def GPRF16C : RISCVRegisterClass<[f16], 16, (add (sequence "X%u_H", 10, 15),
                                                  (sequence "X%u_H", 8, 9))>;
+def GPRF16NoX0 : RISCVRegisterClass<[f16], 16, (sub GPRF16, X0_H)>;
 
 def GPRF32 : RISCVRegisterClass<[f32], 32, (add (sequence "X%u_W", 10, 17),
                                                 (sequence "X%u_W", 5, 7),
@@ -667,6 +673,10 @@ def GPRF32C : RISCVRegisterClass<[f32], 32, (add (sequence "X%u_W", 10, 15),
                                                  (sequence "X%u_W", 8, 9))>;
 def GPRF32NoX0 : RISCVRegisterClass<[f32], 32, (sub GPRF32, X0_W)>;
 
+def XLenPairRI : RegInfoByHwMode<
+      [RV32,                RV64],
+      [RegInfo<64, 64, 32>, RegInfo<128, 128, 64>]>;
+
 // Dummy zero register for use in the register pair containing X0 (as X1 is
 // not read to or written when the X0 register pair is used).
 def DUMMY_REG_PAIR_WITH_X0 : RISCVReg<0, "0">;
@@ -698,9 +708,8 @@ let RegAltNameIndices = [ABIRegAltName] in {
   }
 }
 
-let RegInfos = RegInfoByHwMode<[RV32, RV64],
-                               [RegInfo<64, 64, 32>, RegInfo<128, 128, 64>]>,
-    DecoderMethod = "DecodeGPRPairRegisterClass" in
+let RegInfos = XLenPairRI,
+    DecoderMethod = "DecodeGPRPairRegisterClass" in {
 def GPRPair : RISCVRegisterClass<[XLenPairFVT], 64, (add
     X10_X11, X12_X13, X14_X15, X16_X17,
     X6_X7,
@@ -710,6 +719,13 @@ def GPRPair : RISCVRegisterClass<[XLenPairFVT], 64, (add
     X0_Pair, X2_X3, X4_X5
 )>;
 
+def GPRPairC : RISCVRegisterClass<[XLenPairFVT], 64, (add
+  X10_X11, X12_X13, X14_X15, X8_X9
+)>;
+
+def GPRPairNoX0 : RISCVRegisterClass<[XLenPairFVT], 64, (sub GPRPair, X0_Pair)>;
+} // let RegInfos = XLenPairRI, DecoderMethod = "DecodeGPRPairRegisterClass"
+
 // The register class is added for inline assembly for vector mask types.
 def VM : VReg<VMaskVTs, (add VR), 1>;
 
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll b/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
index c480ba800c6904..08e91736582012 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-d-constraint-f.ll
@@ -39,6 +39,39 @@ define double @constraint_f_double(double %a) nounwind {
   ret double %2
 }
 
+define double @constraint_cf_double(double %a) nounwind {
+; RV32F-LABEL: constraint_cf_double:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    addi sp, sp, -16
+; RV32F-NEXT:    sw a0, 8(sp)
+; RV32F-NEXT:    sw a1, 12(sp)
+; RV32F-NEXT:    fld fa5, 8(sp)
+; RV32F-NEXT:    lui a0, %hi(gd)
+; RV32F-NEXT:    fld fa4, %lo(gd)(a0)
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    fadd.d fa5, fa5, fa4
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fsd fa5, 8(sp)
+; RV32F-NEXT:    lw a0, 8(sp)
+; RV32F-NEXT:    lw a1, 12(sp)
+; RV32F-NEXT:    addi sp, sp, 16
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_cf_double:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gd)
+; RV64F-NEXT:    fld fa5, %lo(gd)(a1)
+; RV64F-NEXT:    fmv.d.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    fadd.d fa5, fa4, fa5
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.d a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm "fadd.d $0, $1, $2", "=^cf,^cf,^cf"(double %a, double %1)
+  ret double %2
+}
+
 define double @constraint_f_double_abi_name(double %a) nounwind {
 ; RV32F-LABEL: constraint_f_double_abi_name:
 ; RV32F:       # %bb.0:
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-d-modifier-N.ll b/llvm/test/CodeGen/RISCV/inline-asm-d-modifier-N.ll
new file mode 100644
index 00000000000000..581cf8e3bf3c9e
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/inline-asm-d-modifier-N.ll
@@ -0,0 +1,109 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+d -target-abi=ilp32 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV32F %s
+; RUN: llc -mtriple=riscv64 -mattr=+d -target-abi=lp64 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV64F %s
+
+;; `.insn 0x4, 0x02000053 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)` is
+;; the raw encoding for `fadd.d`
+
+@gd = external global double
+
+define double @constraint_f_double(double %a) nounwind {
+; RV32F-LABEL: constraint_f_double:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    addi sp, sp, -16
+; RV32F-NEXT:    sw a0, 8(sp)
+; RV32F-NEXT:    sw a1, 12(sp)
+; RV32F-NEXT:    fld fa5, 8(sp)
+; RV32F-NEXT:    lui a0, %hi(gd)
+; RV32F-NEXT:    fld fa4, %lo(gd)(a0)
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x02000053 | (15 << 7) | (15 << 15) | (14 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fsd fa5, 8(sp)
+; RV32F-NEXT:    lw a0, 8(sp)
+; RV32F-NEXT:    lw a1, 12(sp)
+; RV32F-NEXT:    addi sp, sp, 16
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_f_double:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gd)
+; RV64F-NEXT:    fld fa5, %lo(gd)(a1)
+; RV64F-NEXT:    fmv.d.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    .insn 0x4, 0x02000053 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.d a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm ".insn 0x4, 0x02000053 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)", "=f,f,f"(double %a, double %1)
+  ret double %2
+}
+
+define double @constraint_cf_double(double %a) nounwind {
+; RV32F-LABEL: constraint_cf_double:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    addi sp, sp, -16
+; RV32F-NEXT:    sw a0, 8(sp)
+; RV32F-NEXT:    sw a1, 12(sp)
+; RV32F-NEXT:    fld fa5, 8(sp)
+; RV32F-NEXT:    lui a0, %hi(gd)
+; RV32F-NEXT:    fld fa4, %lo(gd)(a0)
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x02000053 | (15 << 7) | (15 << 15) | (14 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fsd fa5, 8(sp)
+; RV32F-NEXT:    lw a0, 8(sp)
+; RV32F-NEXT:    lw a1, 12(sp)
+; RV32F-NEXT:    addi sp, sp, 16
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_cf_double:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gd)
+; RV64F-NEXT:    fld fa5, %lo(gd)(a1)
+; RV64F-NEXT:    fmv.d.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    .insn 0x4, 0x02000053 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.d a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm ".insn 0x4, 0x02000053 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)", "=^cf,^cf,^cf"(double %a, double %1)
+  ret double %2
+}
+
+define double @constraint_f_double_abi_name(double %a) nounwind {
+; RV32F-LABEL: constraint_f_double_abi_name:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    addi sp, sp, -16
+; RV32F-NEXT:    sw a0, 8(sp)
+; RV32F-NEXT:    sw a1, 12(sp)
+; RV32F-NEXT:    fld fa1, 8(sp)
+; RV32F-NEXT:    lui a0, %hi(gd)
+; RV32F-NEXT:    fld fs0, %lo(gd)(a0)
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x02000053 | (0 << 7) | (11 << 15) | (8 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fsd ft0, 8(sp)
+; RV32F-NEXT:    lw a0, 8(sp)
+; RV32F-NEXT:    lw a1, 12(sp)
+; RV32F-NEXT:    addi sp, sp, 16
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_f_double_abi_name:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gd)
+; RV64F-NEXT:    fld fs0, %lo(gd)(a1)
+; RV64F-NEXT:    fmv.d.x fa1, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    .insn 0x4, 0x02000053 | (0 << 7) | (11 << 15) | (8 << 20)
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.d a0, ft0
+; RV64F-NEXT:    ret
+  %1 = load double, ptr @gd
+  %2 = tail call double asm ".insn 0x4, 0x02000053 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)", "={ft0},{fa1},{fs0}"(double %a, double %1)
+  ret double %2
+}
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-f-constraint-f.ll b/llvm/test/CodeGen/RISCV/inline-asm-f-constraint-f.ll
index 91922cd236dfff..a91c6544f9e29c 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-f-constraint-f.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-f-constraint-f.ll
@@ -1,5 +1,4 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; NOTE: Assertions gave been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=riscv32 -mattr=+f -target-abi=ilp32 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=RV32F %s
 ; RUN: llc -mtriple=riscv64 -mattr=+f -target-abi=lp64 -verify-machineinstrs < %s \
@@ -38,6 +37,33 @@ define float @constraint_f_float(float %a) nounwind {
   ret float %2
 }
 
+define float @constraint_cf_float(float %a) nounwind {
+; RV32F-LABEL: constraint_cf_float:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    lui a1, %hi(gf)
+; RV32F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV32F-NEXT:    fmv.w.x fa4, a0
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    fadd.s fa5, fa4, fa5
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fmv.x.w a0, fa5
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_cf_float:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gf)
+; RV64F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV64F-NEXT:    fmv.w.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    fadd.s fa5, fa4, fa5
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.w a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load float, ptr @gf
+  %2 = tail call float asm "fadd.s $0, $1, $2", "=^cf,cf,cf"(float %a, float %1)
+  ret float %2
+}
+
 define float @constraint_f_float_abi_name(float %a) nounwind {
 ; RV32F-LABEL: constraint_f_float_abi_name:
 ; RV32F:       # %bb.0:
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-f-modifier-N.ll b/llvm/test/CodeGen/RISCV/inline-asm-f-modifier-N.ll
new file mode 100644
index 00000000000000..a0de5c71a7df6a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/inline-asm-f-modifier-N.ll
@@ -0,0 +1,96 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 -mattr=+f -target-abi=ilp32 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV32F %s
+; RUN: llc -mtriple=riscv64 -mattr=+f -target-abi=lp64 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV64F %s
+; RUN: llc -mtriple=riscv32 -mattr=+d -target-abi=ilp32 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV32F %s
+; RUN: llc -mtriple=riscv64 -mattr=+d -target-abi=lp64 -verify-machineinstrs -no-integrated-as < %s \
+; RUN:   | FileCheck -check-prefix=RV64F %s
+
+;; `.insn 0x4, 0x53 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)` is
+;; the raw encoding for `fadd.s`
+
+@gf = external global float
+
+define float @constraint_f_modifier_N_float(float %a) nounwind {
+; RV32F-LABEL: constraint_f_modifier_N_float:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    lui a1, %hi(gf)
+; RV32F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV32F-NEXT:    fmv.w.x fa4, a0
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x53 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fmv.x.w a0, fa5
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_f_modifier_N_float:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gf)
+; RV64F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV64F-NEXT:    fmv.w.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    .insn 0x4, 0x53 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.w a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load float, ptr @gf
+  %2 = tail call float asm ".insn 0x4, 0x53 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)", "=f,f,f"(float %a, float %1)
+  ret float %2
+}
+
+
+define float @constraint_cf_modifier_N_float(float %a) nounwind {
+; RV32F-LABEL: constraint_cf_modifier_N_float:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    lui a1, %hi(gf)
+; RV32F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV32F-NEXT:    fmv.w.x fa4, a0
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x53 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fmv.x.w a0, fa5
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: constraint_cf_modifier_N_float:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gf)
+; RV64F-NEXT:    flw fa5, %lo(gf)(a1)
+; RV64F-NEXT:    fmv.w.x fa4, a0
+; RV64F-NEXT:    #APP
+; RV64F-NEXT:    .insn 0x4, 0x53 | (15 << 7) | (14 << 15) | (15 << 20)
+; RV64F-NEXT:    #NO_APP
+; RV64F-NEXT:    fmv.x.w a0, fa5
+; RV64F-NEXT:    ret
+  %1 = load float, ptr @gf
+  %2 = tail call float asm ".insn 0x4, 0x53 | (${0:N} << 7) | (${1:N} << 15) | (${2:N} << 20)", "=^cf,^cf,^cf"(float %a, float %1)
+  ret float %2
+}
+
+define float @modifier_N_float_abi_name(float %a) nounwind {
+; RV32F-LABEL: modifier_N_float_abi_name:
+; RV32F:       # %bb.0:
+; RV32F-NEXT:    lui a1, %hi(gf)
+; RV32F-NEXT:    flw fs0, %lo(gf)(a1)
+; RV32F-NEXT:    fmv.w.x fa0, a0
+; RV32F-NEXT:    #APP
+; RV32F-NEXT:    .insn 0x4, 0x53 | (0 << 7) | (10 << 15) | (8 << 20)
+; RV32F-NEXT:    #NO_APP
+; RV32F-NEXT:    fmv.x.w a0, ft0
+; RV32F-NEXT:    ret
+;
+; RV64F-LABEL: modifier_N_float_abi_name:
+; RV64F:       # %bb.0:
+; RV64F-NEXT:    lui a1, %hi(gf)
+; RV64F-NEXT:    flw fs0, %lo(gf)(a1)
+; RV64F-NEXT:    fmv.w.x fa0, a0
+; RV64F-NEXT:...
[truncated]

@lenary
Copy link
Member Author

lenary commented Oct 16, 2024

I spent some time trying to write a test that should fail before, but doesn't after, but it doesn't happen - I presume because x0 is reasonably dealt with by the allocator. This maybe suggests the patch isn't needed? I'm not sure why we do this for GPRs then.

@lenary
Copy link
Member Author

lenary commented Oct 16, 2024

(My approach with the test was to try to use all 32 GPRs, so have it fail when it couldn't use x0 - doing so by marking ra, sp, gp, and tp as clobbered, and passing 28 floats)

This change implements support for the `cr` and `cf` register
constraints (which allocate a RVC GPR or RVC FPR respectively), and the
`N` modifier (which prints the raw encoding of a register rather than the
name).

The intention behind these additions is to make it easier to use inline
assembly when assembling raw instructions that are not supported by the
compiler, for instance when experimenting with new instructions or when
supporting proprietary extensions outside the toolchain.

These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92

As part of the implementation, I felt there was not enough coverage of
inline assembly and the "in X" floating-point extensions, so I have
added more regression tests around these configurations.
I'm not sure if this fix is required, but I've written the patch anyway.
We can drop this commit if we don't think it's a bug. This does not
cause test changes, but we haven't got tests that try to use all 32
registers in inline assembly.

Broadly, for GPRs, we made the explicit choice that `r` constraints
would never attempt to use `x0`, because `x0` isn't really usable like
the other GPRs. I believe the same thing applies to `Zhinx`, `Zfinx` and
`Zdinx` because they should not be allocating operands to `x0` either,
so this patch introduces new `NoX0` classes for `GPRF16` and `GPRF32`
registers, and uses them with inline assembly. There is also a
`GPRPairNoX0` for the `Zdinx` case on rv32, avoiding use of the `x0`
pair which has different behaviour to the other GPR pairs.
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@topperc
Copy link
Collaborator

topperc commented Oct 18, 2024

I spent some time trying to write a test that should fail before, but doesn't after, but it doesn't happen - I presume because x0 is reasonably dealt with by the allocator. This maybe suggests the patch isn't needed? I'm not sure why we do this for GPRs then.

I think the original issue was copy propagation folding "mv a0, x0" into an inline assembly source as "x0". The instruction being used in the assembly was one that treated x0 as a special encoding. So there was a difference between x0 and a register containing the value 0. See #63747

@lenary
Copy link
Member Author

lenary commented Oct 18, 2024

Yeah, I went back and looked at that issue, it's helpful to know that it was more than just the register allocator, that it was also copy propagation, which wasn't obvious from the comments.

The failing tests are not related, but presumably caused by the rebase, but I'll still wait a bit to merge this.

topperc added a commit to topperc/llvm-project that referenced this pull request Oct 18, 2024
This would allow some inline assembly code work with either F or Zfinx.
This appears to match gcc behavior.

This will need to be adjust to exclude X0 after llvm#112563.
@lenary
Copy link
Member Author

lenary commented Oct 18, 2024

I'll still wait a bit to merge this.

Scratch that, merging now to unblock other PRs.

@lenary lenary merged commit 03dcd88 into llvm:main Oct 18, 2024
6 of 8 checks passed
@lenary lenary deleted the pr/riscv-inx-no-x0 branch October 18, 2024 21:33
topperc added a commit to topperc/llvm-project that referenced this pull request Oct 18, 2024
This would allow some inline assembly code work with either F or Zfinx.
This appears to match gcc behavior.

This will need to be adjust to exclude X0 after llvm#112563.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants