Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error: invalid operand for instruction in arch/arm/vfp/vfpinstr.h #306

Closed
agners opened this issue Dec 30, 2018 · 17 comments
Closed

error: invalid operand for instruction in arch/arm/vfp/vfpinstr.h #306

agners opened this issue Dec 30, 2018 · 17 comments
Assignees
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] llvm A bug that should be fixed in upstream LLVM [FIXED][LLVM] 11 This bug was fixed in LLVM 11.0 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler

Comments

@agners
Copy link

agners commented Dec 30, 2018

Compiling with integrated-as for c files fails for arch/arm/kernel/perf_event_v7.c.

arch/arm/kernel/perf_event_v7.c:1377:15: error: invalid operand for instruction
        asm volatile("mcr p10, 7, %0, c11, c0, 0" : : "r" (val));
                     ^
<inline asm>:1:6: note: instantiated into assembly here
        mcr p10, 7, r0, c11, c0, 0
            ^
arch/arm/kernel/perf_event_v7.c:1371:15: error: invalid operand for instruction
        asm volatile("mrc p10, 7, %0, c11, c0, 0" : "=r" (val));
                     ^
<inline asm>:1:6: note: instantiated into assembly here
        mrc p10, 7, r6, c11, c0, 0
            ^

Update: Changed this ticket to only reflect perf_event_v7.c problem. The issue of arch/arm/vfp/vfpmodule.c has been moved to #905.

@agners agners added [BUG] Untriaged Something isn't working [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [ARCH] arm32 This bug impacts ARCH=arm labels Dec 30, 2018
@agners
Copy link
Author

agners commented Dec 30, 2018

It seems to me that the LLVM assembler really does not support p10 operand here. GNU assembler uses the following flags:

"/home/sag/armv7-eabihf--glibc--bleeding-edge-2018.11-1/bin/arm-buildroot-linux-gnueabihf-as" -EL -mfloat-abi=soft -march=armv7-a -mfpu=vfp -I ./arch/arm/include -I ./arch/arm/include/generated -I ./include -I ./arch/arm/include/uapi -I ./arch/arm/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -o arch/arm/vfp/vfpmodule.o /tmp/vfpmodule-0a8be4.s

@agners
Copy link
Author

agners commented Mar 6, 2019

It seems that Co-Processor 10 and 11 (we use p10 here) is not allowed on purpose (from lib/Target/ARM/AsmParser/ARMAsmParser.cpp):

/// parseCoprocNumOperand - Try to parse an coprocessor number operand. The
/// token must be an Identifier when called, and if it is a coprocessor
/// number, the token is eaten and the operand is added to the operand list.
OperandMatchResultTy
ARMAsmParser::parseCoprocNumOperand(OperandVector &Operands) {
  MCAsmParser &Parser = getParser();
  SMLoc S = Parser.getTok().getLoc();
  const AsmToken &Tok = Parser.getTok();
  if (Tok.isNot(AsmToken::Identifier))
    return MatchOperand_NoMatch;

  int Num = MatchCoprocessorOperandName(Tok.getString(), 'p');
  if (Num == -1)
    return MatchOperand_NoMatch;
  // ARMv7 and v8 don't allow cp10/cp11 due to VFP/NEON specific instructions
  if ((hasV7Ops() || hasV8Ops()) && (Num == 10 || Num == 11))
    return MatchOperand_NoMatch;

  Parser.Lex(); // Eat identifier token.
  Operands.push_back(ARMOperand::CreateCoprocNum(Num, S));
  return MatchOperand_Success;
}

We probably should just use the VFP/NEON specific instructions I guess?

@smithp35 any thoughts on this?

@smithp35
Copy link

smithp35 commented Mar 6, 2019

There is a long discussion on https://llvm.org/pr20025 which is the source of those lines. Beforehand no architecture was permitted to use 10 and 11.

As far as I can tell there isn't anything in what counts for the assembly language specification that prohibits use of 10 and 11 on ARMv7 and above. It seems to be a policy choice to prevent someone from misusing them given that any use for something else would conflict with Neon/VFP and given that specific instructions exist why not use them.

Personally if you are able to use the new instructions then I suggest doing so. There is a case where code has to assemble on multiple architectures even if they don't support the instructions, which is mentioned in the PR. For that case this could be worked around by using a .arch directive to locally change to a pre-armv7-a.

If all that fails then I guess we'll need to provide a good enough argument as to why the restriction should be removed. Personally I tend to prefer Postel's law of be liberal in what you accept and be specific in what you output, but I may be in the minority.

@agners
Copy link
Author

agners commented Mar 6, 2019

Dropping the cp10/cp11 check in LLVM allows me to compile and assemble the complete kernel using Clang/LLVM (all c files, for the assembly files I still use integrated-as). The kernel also boots fine.

There are about 5 macros or inline assembly preprocessor defines which use mcr/mrc p10 in the kernel. Those macros and preprocessor defines are then used by various call sites. Since call sites might use different compile flags (e.g. not have -mfpu=vfp enabled) it seems a rather tedious task to replace the cp10 calls with Neon/VFP specific instructions...

Given that allowing cp10/11 in LLVM is rather trivial, I'd rather prefer to go this route (maybe behind an option)?

Just for clarity: E.g. the cp10, 7 instructions can be replace with VMRS/VMSR assembler instructions. As far as I understand they lead to the same encoding anyway, is that correct?

The Arm ARM describes the spec_reg argument, which can be one of FPSID, FPSCR, FPEXC for VMSR and one of FPSID, FPSCR, MVFR1, MVFR0, FPEXC for VMRS. The kernel also uses FPINST and FPINST (defines as cr9 and cr10) Furthermore, in arch/arm/kernel/perf_event_v7.c c11 is used:

static u32 venum_read_pmresr(void)
{
        u32 val;
        asm volatile("mrc p10, 7, %0, c11, c0, 0" : "=r" (val));
        return val;
}

static void venum_write_pmresr(u32 val)
{
        asm volatile("mcr p10, 7, %0, c11, c0, 0" : : "r" (val));
}

@agners
Copy link
Author

agners commented Mar 6, 2019

In a quick test I use vmsr/vmrs in arch/arm/vfp/vfpinstr.h. It seems that LLVM does not like some of the instructions:

arch/arm/vfp/vfpmodule.c:345:2: error: invalid instruction
        fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
        ^
arch/arm/vfp/vfpinstr.h:82:6: note: expanded from macro 'fmxr'
        asm("vmsr " vfpreg(_vfp_) ", r0 @ fmxr  " #_vfp_ ", %0" \
            ^
<inline asm>:1:2: note: instantiated into assembly here
        vmsr cr8, r0 @ fmxr     FPEXC, r0
        ^

This is compiled with -marm -march=armv7-a -mfpu=vfp (but also with -msoft-float sure if relevant?).

@smithp35
Copy link

smithp35 commented Mar 7, 2019

Yes the co-processor instructions with cp10 and cp11 will assemble to the same bitpattern. You should be able to confirm this by disassembling the object. The disassembler will prefer the VFP decoding.

The vmsr is complaining about cr8, clang only accepts the <spec_reg> form of FPEXC whereas gcc will accept cr8. The Arm ARM doesn't mention anything about cr8, all is says is FPEXC reg='1000' so in this case I don't think that there is a problem with clang.

For removing the restriction on cp10 and cp11 I think that is worth having the debate upstream. rengolin was the author and it will be worth including him on any review.

@smithp35
Copy link

smithp35 commented Mar 7, 2019

Another possibility if there is resistance to removing the check might be to issue a warning if cp10 and cp11 are used on v7, something like "Warning, cp10 and cp11 are reserved for Floating-Point and Advanced SIMD extensions." that should catch mistakes as well as permitting system level software that needs to assemble on multiple architectures.

@agners
Copy link
Author

agners commented Mar 17, 2019

Thanks for the hints, using vmsr r0, FPEXC assembles fine with LLVM and GCC. Also found some other instances which need rework e.g. this ldc instructions need replacement with vldr:

/tmp/vfphw-03dfb7.s:553:2: note: while in macro instantiation
 VFPFLDMIA r10, r5 @ reload the working registers while
 ^
<instantiation>:9:9: error: invalid operand for instruction
 ldcleq p11, cr0, [r10],#32*4 @ FLDMIAD r10!, {d16-d31}
        ^

It seems that those registers fail the same check in LLVM (parseCoprocNumOperand).

I still have some instances where I am not sure what the correct replacements are, e.g. the vfp_(get|put)_(float|double) in arch/arm/vfp/vfphw.S.

Typically Linux tries to be warn-free, so if we add a warning there should be an option to explicitly disable it... Where is the right place to start a debate on this upstream?

@agners
Copy link
Author

agners commented Mar 18, 2019

A similar case are deprecated warnings when using co-processor instructions for data-barriers:

  CC      arch/arm/plat-versatile/hotplug.o                                                                                                                                                                       
arch/arm/plat-versatile/hotplug.c:29:3: warning: deprecated since v7, use 'dsb' [-Winline-asm]                                                                                                                    
        "       mcr     p15, 0, %1, c7, c10, 4\n"                                                                                                                                                                 
         ^                                                                                                                                                                                                        
<inline asm>:2:2: note: instantiated into assembly here                                                                                                                                                           
        mcr     p15, 0, r0, c7, c10, 4                                                                                                                                                                            
        ^                                                                                                                                                                                                         
1 warning generated.

Those warnings can be suppressed using -no-deprecated-warn.

It almost seems it would be more consistent to convert the VFP case into a similar warning. The patch is actually rather trivial:

Index: lib/Target/ARM/AsmParser/ARMAsmParser.cpp
===================================================================
--- lib/Target/ARM/AsmParser/ARMAsmParser.cpp   (revision 355280)
+++ lib/Target/ARM/AsmParser/ARMAsmParser.cpp   (working copy)
@@ -3665,9 +3665,6 @@
   int Num = MatchCoprocessorOperandName(Tok.getString(), 'p');
   if (Num == -1)
     return MatchOperand_NoMatch;
-  // ARMv7 and v8 don't allow cp10/cp11 due to VFP/NEON specific instructions
-  if ((hasV7Ops() || hasV8Ops()) && (Num == 10 || Num == 11))
-    return MatchOperand_NoMatch;

   Parser.Lex(); // Eat identifier token.
   Operands.push_back(ARMOperand::CreateCoprocNum(Num, S));
Index: lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
===================================================================
--- lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp     (revision 355280)
+++ lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp     (working copy)
@@ -63,6 +63,12 @@
       return true;
     }
   }
+  if (STI.getFeatureBits()[llvm::ARM::HasV7Ops] &&
+      ((MI.getOperand(0).isImm() && MI.getOperand(0).getImm() == 10) ||
+       (MI.getOperand(0).isImm() && MI.getOperand(0).getImm() == 11))) {
+      Info = "deprecated since v7, use advanced SIMD or floating point instructions";
+      return true;
+  }
   return false;
 }

@smithp35 thoughts?

@smithp35
Copy link

I don't think we can use deprecated here as it has a specific meaning in the Arm ARM and there isn't anywhere in the v7 or v8 Arm ARM that says using the coprocessor mnemonics is deprecated. In the other case you mention for "mcr p15, 0, r0, c7, c10, 4" this is a deprecated encoding as well as mnemonic and is specifically mentioned in the v7 Arm ARM as deprecated.

I think we could say: Info = "since v7, cp10 and cp11 are reserved for advanced SIMD and floating point instructions";

The instructions in vfp_get_float and vfp_put_float should correspond to various forms of VMOV
// Get and put float (s for single precision register)
VMOV r0,s0
VMOV s0,r0
// Get and put double (d for single precision register) fmrrd fmdrr equivalent
VMOV r0,r1,d0
VMOV d0,r0,r1
// mcrr mrcc
VMOV r0,r1,d16
VMOV d16,r0,r1

For the debate upstream I think it is niche enough to address with a patch to LLVM that implements the behaviour with supporting examples from the kernel. I'd think the following set of reviewers would be enough: rengolin, olista01, t.p.northover, efriedma

@agners
Copy link
Author

agners commented Mar 23, 2019

Created a patch for LLVM: https://reviews.llvm.org/D59733

@nickdesaulniers nickdesaulniers added [BUG] llvm A bug that should be fixed in upstream LLVM [PATCH] Submitted A patch has been submitted for review and removed [BUG] Untriaged Something isn't working labels Mar 25, 2019
@nickdesaulniers
Copy link
Member

tentatively marking as a bug in LLVM; feel free to change labels if you disagree.

@nickdesaulniers
Copy link
Member

@agners , any plans to update based on @smithp35 's comments?

@agners
Copy link
Author

agners commented Feb 2, 2020

Hm, there have been related changes nearby: https://reviews.llvm.org/D63863

Working on a rebased version.

@agners
Copy link
Author

agners commented Feb 24, 2020

I reviewed this case a bit since we discussed alternative solutions by using assembly directives such as .arch or .fpu to solve issue like this.

However, there is one case which I think there won't be a solution in arch/arm/kernel/perf_event_v7.c:

static u32 venum_read_pmresr(void)
{
        u32 val;
        asm volatile("mrc p10, 7, %0, c11, c0, 0" : "=r" (val));
        return val;
}

static void venum_write_pmresr(u32 val)
{
        asm volatile("mcr p10, 7, %0, c11, c0, 0" : : "r" (val));
}

This leads to:

arch/arm/kernel/perf_event_v7.c:1377:15: error: invalid operand for instruction
        asm volatile("mcr p10, 7, %0, c11, c0, 0" : : "r" (val));
                     ^
<inline asm>:1:6: note: instantiated into assembly here
        mcr p10, 7, r0, c11, c0, 0
            ^

GNU objdump disassembles this to:

    1054:       eefb 3a10       vmrs    r3, <impl def 0xb>

@arndb any thoughts?

@agners agners changed the title invalid operand for instruction in arch/arm/vfp/vfpinstr.h error: invalid operand for instruction in arch/arm/vfp/vfpinstr.h Feb 24, 2020
@nickdesaulniers
Copy link
Member

nickdesaulniers commented Jun 24, 2020

Hey @agners landed https://reviews.llvm.org/D59733! Marking this as fixed? (or is there more to do here?)

@nickdesaulniers nickdesaulniers removed the [PATCH] Submitted A patch has been submitted for review label Jun 24, 2020
@nickdesaulniers nickdesaulniers added the [FIXED][LLVM] 11 This bug was fixed in LLVM 11.0 label Jun 24, 2020
@agners
Copy link
Author

agners commented Jun 24, 2020

@nickdesaulniers yes this can be marked solved. I plan to send a related patch to the Linux ML as well which uses new mnemonics where available/possible, but I track this with a separate bug #905 already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] llvm A bug that should be fixed in upstream LLVM [FIXED][LLVM] 11 This bug was fixed in LLVM 11.0 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler
Projects
None yet
Development

No branches or pull requests

3 participants