Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIR: asm statement support, Wip #22992

Closed
wants to merge 36 commits into from
Closed

NIR: asm statement support, Wip #22992

wants to merge 36 commits into from

Conversation

ASVIEST
Copy link
Contributor

@ASVIEST ASVIEST commented Nov 27, 2023

GlobalAsm instr is basic asm, which is just str literals or some instrs

Asm instr is GCC extended asm
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Asm {
  AsmTemplate {
     Some asm code
     SymUse nimInlineVar # `a`
     Some asm code
   }
  AsmOutputOperand {
     # [asmSymbolicName] constraint (nimVariableName)
     AsmInjectExpr {symUse nimVariableName} # for output it have only one sym (lvalue)
     asmSymbolicName # default: ""
     constraint
  }
  AsmInputOperand {
     # [asmSymbolicName] constraint (nimExpr)
     AsmInjectExpr {symUse nimVariableName} # (rvalue)
     asmSymbolicName # default: ""
     constraint
  }
  AsmClobber {
    "clobber"
  }
}

This structure is simple for codegens (including those what may be in the future like libgccjit or LLVM).
Also it can produce some nim errors without dependencies of backends.

@Araq
Copy link
Member

Araq commented Nov 28, 2023

Btw here is my NIR todo:

NIR TODO:

  • Implement closure calls.
  • Port block exiting logic from the C codegen.
  • Port parameter passing and NRVO.
  • Port 3 opcodes to enable closure iterators.
  • Port Runtime type information from the C codegen.
  • Port the few missing magics from the C codegen.

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Nov 28, 2023

# asm (for gcc extended asm it includes also AsmTemplate, AsmOutOperand*, AsmInOperand*)
Emit {
  EmitTarget "Asm"
  Info {
    InfoKind "IsGlobal"
    false
  }

  # only GCC extended asm info.
  # It can be used by backend or not used.
  Info {
    InfoKind "AsmTemplate"
    Verbatim """
      mov %1, %0
      add $1, %0
    """
  }
  # there can be any number of operands
  Info {
   InfoKind "AsmOutOperand"
   SymUse r
   Verbatim ""
   Verbatim "=r"
  }
  Info {
   InfoKind "AsmInOperand"
   SymUse r
   Verbatim ""
   Verbatim "r"
  }

  EmitCode {
    Verbatim """
      mov %1, %0
      add $1, %0 
      :"=r"(
    """
    SymUse result
    Verbatim """
      )
      :"r"(
    """
    SymUse dst
    Verbatim ")"
   """
  }

When using gcc extended asm, it contains useful information for backends.
The EmitCode instruction directly emits code that is a list of Verbatim, SymUse, etc. into c code (as an example).

What do you think about this design ?

@Araq
Copy link
Member

Araq commented Nov 28, 2023

What do you think about this design ?

It's wrong. You effectively do the GCC specific asm parsing in the Nim frontend, but it belongs into the "NIR to C" backend part. Only the Nim specific parsing should be done in the frontend.

@ASVIEST ASVIEST marked this pull request as draft November 29, 2023 18:19
compiler/nir/nirinsts.nim Outdated Show resolved Hide resolved
@Araq
Copy link
Member

Araq commented Nov 30, 2023

Already 600 lines that accomplishes what exactly? Parsing the clobber lists of GCC's unreadable inline assembler in order to pass it back to GCC to parse it for itself. Sorry, but this is terrible.

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Nov 30, 2023

The initial goal was to help future backends like LLVM or libgccjit in parsing the extended assembler (as auxiliary information) when it is available, since there is no parsing of the assembler itself. It is more logical to do it not on the frontend, but perhaps on an intermediate stage before the backend (Nir -> Nir) that has information about basic things, such as for example assembler syntax (gcc or Visual C++), or on the backend, which is worse. Asm parsing is mainly aimed at this future backends(LLVM, libgccjit), but it have some pluses on other targets (C, C++).

The main plus for C(++) purposes is that the parsing is done on the Nim side, not the compiler, so it can then be parsed, e.g. not allowing such code

proc test(a: int64) =
  asm """
    add %[val], %[a]
    :[a]"=r"(`a`)
    :[val]"r"((long long)(4/2))
  """

var a: int64 = 35
test(a)
echo a

This code compiles (although it shouldn't because it tries to change a parameter), but it can't change the variable a (it stays with the value 35), which is good, but the fact that nim's code is compiled (with invalid behavour) is not pleasant.

proc test(a: var int64) =
  asm """
    add %[val], %[a]
    :[a]"=r"(`a`)
    :[val]"r"((long long)(4/2))
  """

var a: int64 = 35
test(a)
echo a

Here the code compiles again, but again it doesn't change a (for C through GCC), or when compiling in C++ through GCC, a becomes random number.
Because of this, you should write this code instead of the previous one:

proc test(a: var int64) =
  var b = a
  asm """
    add %[val], %[a]
    :[a]"=r"(`b`)
    :[val]"r"((long long)(4/2))
  """
  a = b

var a: int64 = 35
test(a)
echo a 

Now it works as it should.

Analysis of gcc extended asm also move some C, C++ compiler exceptions to Nir, which means they can be get in nimsuggest, etc.

Without such analysis, adding support for such cases is simply not possible because the Nim compiler will think of assembly language as just a black box, not knowing what is being changed, etc.
About repetitive parsing of assembler in GCC(or clang) and in Nim is completely true, but it is the simplest thing in all inline asm, the main complexity of it in the interaction of the built-in assembler with the compiler-generated assembler and code, for example, a jump in the assembler should not violate the space of compiler-generated labels, etc.
See bytecodealliance/wasmtime#1041 (comment).
It's really hard work, but not parsing.

@Araq
Copy link
Member

Araq commented Dec 1, 2023

Ok, well, currently not even "hello world" works with NIR, so can we focus on the essentials please before adding superior logic for inline asm...

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Dec 1, 2023

Ok, then I'll do it without parsing and analyzing the inline asm, for now, and keep the parsing part in a separate file for when NIR is more complete and there's time to add support for assembler analyzing.

@arnetheduck
Copy link
Contributor

fwiw, I wanted to do this for nlvm which would benefit from proper asm parsing (so it can set up the correct constraints like gcc does from the nim-inlined asm code) - if the asm parser can be separate, that would certainly be helpful

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Dec 1, 2023

You can't just import. Keep in mind that this only works with NIR instructions (previously with ast, but now it is not supported), and not with strings, since the asm stategment may contain nim symbols and expressions. Therefore, it will be more correct to use this when nlvm generates llvm IR from NIR, and not from ast. For now you can take gcc_extended_asm.nim and replace it's tokenizer by genasm.nim tokenizer (parseAsm iterator). Note: it don't parses labels.

compiler/nir/nirinsts.nim Outdated Show resolved Hide resolved
@ASVIEST ASVIEST marked this pull request as ready for review December 4, 2023 18:51
@ASVIEST
Copy link
Contributor Author

ASVIEST commented Dec 8, 2023

Currently this can generate code for GCC and VCC inline asm, this can be selected via the -a gcc-like(or msvc-like) flag on nirc. The basic assembler also works. noAsmStackFrame also works, instead of generate __declspec or attribute naked it generates NIM_NAKED C macro. Assembler parsing is separated into a separated file, which is not currently used. Assembly analysis will be offered as rfc.

RFC: nim-lang/RFCs#542

@ASVIEST ASVIEST closed this Dec 8, 2023
@ASVIEST ASVIEST reopened this Dec 8, 2023
@Araq
Copy link
Member

Araq commented Dec 9, 2023

Currently this can generate code for GCC and VCC inline asm, this can be selected via the -a gcc-like(or msvc-like) flag on nirc.

The used inline assembler format should be a pragma though, not a command line option and nimrc should have a dialect switch so that we can say "produce C code compatible with VCC etc".

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Dec 12, 2023

Currently this can generate code for GCC and VCC inline asm, this can be selected via the -a gcc-like(or msvc-like) flag on nirc.

The used inline assembler format should be a pragma though, not a command line option and nimrc should have a dialect switch so that we can say "produce C code compatible with VCC etc".

-a flag say "can produce C code from inline asm statements compatible with VCC etc".
If there will be only dialect switch (without -a flag), what dialect will be for example ICC (intel C compiler) ? How dialect can show that C compiler don't support inline asm (like some tcc builds) ? If for ICC etc. it is a different dialect, then the codegen will have to check all dialects, etc. I think it is better to leave the -a flag for customization and add the targetCC (or dialect) flag, which sets the default settings for the selected compiler.

The pragma idea is good for deciding what syntax to generate assembler code with, for example for ICC, the assembler can be either gcc style or vcc style. If code uses an procedure with assembler from an external library, the assembler stmt must have a pragma that explains it gcc(AT&T) or vcc(intel), then it can use libraries that that use gcc or vcc syntax.

@ASVIEST
Copy link
Contributor Author

ASVIEST commented Dec 12, 2023

Now it can resolve asm syntax kind. For example backend Inline asm syntax is {"gcc", "vcc"} and via {.inlineAsmSyntax: "gcc".} can specify that asm stmt with gcc syntax.
NOTE: this pragma not required when CC support only one syntax: gcc/vcc, but recommend to make code more universal.

AsmInjectExpr
AsmStrVal

GccAsmNode* = ref object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not how we write ASTs in Nim 2023. We now use "packed trees". The files you have touched can guide you.

build c.code, info, EmitCode:
for i in offset..<n.len:
let it = n[i]
case it.kind:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not how the rest of the codebase indents the case statement.

@@ -31,6 +32,7 @@ type
Semicolon = ";"
Comma = ", "
Space = " "
Quote = $'"'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Quote = $'"'
Quote = "\""

left = 0
for i in 0..s.high:
if s[i] == '\n':
c.add s[left..i - 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? Why? Didn't you write a parser for the asm? Shouldn't it split it at newlines if they are important?


var asmTemplate = true
var left = 0
var s = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is s mutable?

assert (
isLastSon(t, code, code.firstSon) and
t[code.firstSon].kind == Verbatim
), "Invalid basic asm. Basic asm should be only one verbatim"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the frontend do this?

of Code:
raiseAssert"not supported"

of EmitTarget:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a EmitTarget an opcode of its own? We have a pragma annotation system in place for this...

@Araq
Copy link
Member

Araq commented Dec 15, 2023

700 lines we have to maintain for good all the while "NIR" cannot even do "hello world". Enough of this.

@Araq Araq closed this Dec 15, 2023
@Araq
Copy link
Member

Araq commented Dec 15, 2023

Reopen it once NIR can do something useful at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants