Support big- and little-endian lane order with bitcast #5196

Merged on Nov 7, 2022 (1 commit)

@uweigand uweigand commented Nov 4, 2022

Add a MemFlags operand to the bitcast instruction, where only the big and little flags are accepted. These define the lane order to be used when casting between types of different lane counts.

Update all users to pass an appropriate MemFlags argument.

Implement lane swaps where necessary in the s390x back-end.

This is the final part needed to fix #4566.

CC @cfallin

This is still an RFC because of the following open questions:

  • Should we enforce that a byte order flag must be explicitly specified if the types differ in lane count? (This would be easily doable in the verifier.)
  • There was some discussion in the issue about whether the MemFlags API is the right choice. It seems so to me, since it makes the semantics easy to specify and is simple to implement on top of the existing MemFlags infrastructure. If you prefer some other implementation, please let me know - I'd be happy to change this.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift:wasm labels Nov 4, 2022

@cfallin cfallin left a comment

Thanks -- this is surprisingly simple; I'm happy with how cleanly it came out. The simple definition/derivation of bitcast's semantics via store+load combines really nicely with our endianness support generally.
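To make the store+load derivation concrete, here is a minimal Rust sketch that models a bitcast from four 32-bit lanes to two 64-bit lanes through a scratch buffer. It does not use Cranelift: `ByteOrder` and `bitcast_i32x4_to_i64x2` are hypothetical names introduced only for this illustration, and the byte order stands in for the `big`/`little` flag on the instruction.

```rust
/// Hypothetical stand-in for the `big` / `little` flag (not the Cranelift API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ByteOrder {
    Little,
    Big,
}

/// Model a bitcast from four 32-bit lanes to two 64-bit lanes as
/// "store, then load" through a 16-byte scratch buffer, using the
/// same byte order for both steps.
fn bitcast_i32x4_to_i64x2(lanes: [u32; 4], order: ByteOrder) -> [u64; 2] {
    // Conceptual store: input lane i occupies bytes 4*i .. 4*i+4.
    let mut mem = [0u8; 16];
    for (i, lane) in lanes.iter().enumerate() {
        let bytes = match order {
            ByteOrder::Little => lane.to_le_bytes(),
            ByteOrder::Big => lane.to_be_bytes(),
        };
        mem[4 * i..4 * i + 4].copy_from_slice(&bytes);
    }
    // Conceptual load: output lane j is read from bytes 8*j .. 8*j+8.
    let mut out = [0u64; 2];
    for (j, lane) in out.iter_mut().enumerate() {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&mem[8 * j..8 * j + 8]);
        *lane = match order {
            ByteOrder::Little => u64::from_le_bytes(bytes),
            ByteOrder::Big => u64::from_be_bytes(bytes),
        };
    }
    out
}

fn main() {
    let v = [0x1111_1111, 0x2222_2222, 0x3333_3333, 0x4444_4444];
    // The two byte orders pair up the input lanes differently.
    println!("{:x?}", bitcast_i32x4_to_i64x2(v, ByteOrder::Little));
    println!("{:x?}", bitcast_i32x4_to_i64x2(v, ByteOrder::Big));
}
```

In this model, input lane 0 lands in the low half of output lane 0 under little-endian order and in the high half under big-endian order, which is why the result depends on the flag once the lane counts differ.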

I think using a restricted subset of MemFlags is fine for that reason. (Instruction format too; at first LoadNoOffset seemed a bit odd but none of the instruction predicates come directly from the format (e.g. can_load) and so I think it's better than introducing another enum variant in the end.)

I do think we should require one of big or little when lane-count differs, as you suggest; otherwise this is a new way that endian-specific behavior is visible in CLIF, which I'd prefer to avoid. Happy to r+ once that is added!
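The requested rule can be sketched as a standalone check. This is not the verifier code from the PR: `VecType`, `ByteOrder`, and `check_bitcast` are hypothetical, simplified stand-ins, and the error strings are invented for illustration.

```rust
/// Hypothetical stand-in for the byte-order part of MemFlags (not the real API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ByteOrder {
    Unspecified,
    Little,
    Big,
}

/// Hypothetical, simplified vector type: lane count and lane width in bits.
#[derive(Clone, Copy, Debug, PartialEq)]
struct VecType {
    lanes: u32,
    lane_bits: u32,
}

impl VecType {
    fn total_bits(self) -> u32 {
        self.lanes * self.lane_bits
    }
}

/// Sketch of the rule: sizes must match, and an explicit byte order is
/// required whenever the input and output lane counts differ.
fn check_bitcast(from: VecType, to: VecType, order: ByteOrder) -> Result<(), String> {
    if from.total_bits() != to.total_bits() {
        return Err(format!(
            "bitcast size mismatch: {} bits vs {} bits",
            from.total_bits(),
            to.total_bits()
        ));
    }
    if from.lanes != to.lanes && order == ByteOrder::Unspecified {
        return Err("bitcast between different lane counts needs `big` or `little`".to_string());
    }
    Ok(())
}

fn main() {
    let i32x4 = VecType { lanes: 4, lane_bits: 32 };
    let i64x2 = VecType { lanes: 2, lane_bits: 64 };
    println!("{:?}", check_bitcast(i32x4, i64x2, ByteOrder::Unspecified));
    println!("{:?}", check_bitcast(i32x4, i64x2, ByteOrder::Little));
    println!("{:?}", check_bitcast(i32x4, i32x4, ByteOrder::Unspecified));
}
```

Casts that keep the lane count pass without a flag in this sketch, since no lane is split or merged and the byte order cannot change the result.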

@@ -3113,11 +3114,15 @@ pub(crate) fn define(

 The input and output types must be storable to memory and of the same
 size. A bitcast is equivalent to storing one type and loading the other
-type from the same address.
+type from the same address, using the specified MemFlags.

s/using/both using/, just for clarity (other possibilities here include storing with one MemFlags and loading with another, which isn't what we're emulating but someone might imagine this instead)
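A small Rust sketch of why the word "both" matters: if the store and the load were allowed to use different byte orders, the result would match neither flavor of bitcast. The names `ByteOrder` and `store_then_load` are hypothetical and exist only for this illustration; the model casts two 32-bit lanes to one 64-bit value.

```rust
/// Hypothetical stand-in for the `big` / `little` flag (not the Cranelift API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ByteOrder {
    Little,
    Big,
}

/// Store two 32-bit lanes with one byte order, then load a single 64-bit
/// value with a possibly different byte order.
fn store_then_load(lanes: [u32; 2], store: ByteOrder, load: ByteOrder) -> u64 {
    let mut mem = [0u8; 8];
    for (i, lane) in lanes.iter().enumerate() {
        let bytes = match store {
            ByteOrder::Little => lane.to_le_bytes(),
            ByteOrder::Big => lane.to_be_bytes(),
        };
        mem[4 * i..4 * i + 4].copy_from_slice(&bytes);
    }
    match load {
        ByteOrder::Little => u64::from_le_bytes(mem),
        ByteOrder::Big => u64::from_be_bytes(mem),
    }
}

fn main() {
    let v = [0x1122_3344, 0x5566_7788];
    // Same order for both steps: the semantics the doc comment describes.
    println!("{:#x}", store_then_load(v, ByteOrder::Little, ByteOrder::Little));
    println!("{:#x}", store_then_load(v, ByteOrder::Big, ByteOrder::Big));
    // Mixed orders: bytes inside each lane get reversed, which is not a bitcast.
    println!("{:#x}", store_then_load(v, ByteOrder::Little, ByteOrder::Big));
}
```

With matching orders the bytes inside each lane are preserved and only the lane pairing changes; with mixed orders the lane contents themselves are scrambled.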

@uweigand uweigand changed the title [RFC] Support big- and little-endian lane order with bitcast Support big- and little-endian lane order with bitcast Nov 5, 2022

uweigand commented Nov 5, 2022

> I do think we should require one of big or little when lane-count differs, as you suggest; otherwise this is a new way that endian-specific behavior is visible in CLIF, which I'd prefer to avoid. Happy to r+ once that is added!

Ok, implemented. There were four filetests where the error triggered, which I fixed by adding a byte order specifier. Thanks!

@cfallin cfallin left a comment

LGTM with changes, thanks!

@cfallin cfallin merged commit 3e5938e into bytecodealliance:main Nov 7, 2022